guiguts.pl
Front end to gutcheck
with some other post processing functions as well.
Latest version:
guiguts.62.zip
- (487k) contains both a perl script and a Windows executable. To run
the executable, you will need to download and install the perl runtime
libraries. Note: Needs at least prl02. The perl script can be run on an
install of perl 5.8.1 or
above with perl/Tk installed. To use all of the advanced features,
it will need perl/Tk804.026 or higher. The perl runtime
libraries will allow you to run perl scripts against them as well as the
executable.
prl03.zip
- (5773k) perl runtime libraries - contains a full complement of perl
libraries with Tk804.026
installed to allow either perl scripts or the Gui* executables to run
on your system. Installation instructions and information on compiling
your own package are on this page.
This "manual", while it has some usage instructions, is pretty sketchy
and it is not particularly easy to find specific things. At this point
it is much more of a change log than a manual. David Cortesi (dcortesi
on the DP site) has written a much more user-friendly manual. It does
not have a permanent home yet, but I have included a redirect page in
the guiguts package which I will attempt to keep current. The page is
included as an HTML file named ggmanual.html.
Written by Steve Schulze (thundergnat).
Questions or comments? Leave a message in the Distributed Proofreaders
forums or private message me as "thundergnat".
Also check out my preprocessing toolkit, guiprep.pl.
This software has no guarantees as to its fitness to do this or any
other task. Any damages to your computer, data, your mental health or
anything else as a result of using this software are your problem and
not mine. If not satisfied, your purchase price will be cheerfully
refunded.
This program may be freely distributed, used, and modified. Reverse
engineering is condoned and encouraged. If you come up with some really
cool addition (or even just an idea) let me know, and it may be
included
in future releases. If you do reuse some of my code, I would appreciate
you mentioning it in the comments of your
script and dropping me a line to let me know.
CONTENTS
What is it for?
What's new?
Background:
Installing the script:
Using the script:
A few details:
How do I....
Hot keys:
Known bugs and odd behavior:
Hey, it doesn't work!
Change log history:
What is it for?
This script is intended primarily to be a GUI front end to gutcheck,
the file checking utility written by Jim Tinsley through which all
Project Gutenberg texts must pass. Gutcheck is (right now) only
available as a command line utility which writes its findings to the
screen or to a capture file. These results must then be compared
against the text using a third party tool, namely some sort of text
editor. This script is essentially a text editor that has been
customized to work very closely with gutcheck to make fix up easier.
There are several other specialized functions included that are also
helpful to someone trying to get a text formatted for Project
Gutenberg.
Guiguts also provides an interface for Jeebies, a command line utility
(also written by Jim Tinsley) for detecting the very common and hard to
find he/be scannos in texts. Get a copy from its sourceforge page.
The script provides an interface to Aspell or Ispell. If you
have either program installed on your system, you can do full
interactive spell checking. Aspell seems to have better support under Windows, and
is in general a more capable package. Both are free.
What's new?
Version .62 (487k) Ok, I give up. First there were two rounds of proofing
with two proofer names to keep track of. Then there were four rounds;
two proofers and two formatters, except when there wasn't--and except
at DPEU. Now there are five rounds, unless there are four, unless there
are three, except when there are two.... (Throws up hands.) Guiguts no
longer cares how many rounds there are. (Well, there is an artificial
limit to keep processing snappy, but I can increase it indefinitely by
changing one variable.) In practice, right now it is limited to
8 rounds. It will adjust all of the proofer name display
functions to match the number of rounds in the open file. I
shuffled the order of the buttons in the proofer pop-up window slightly
to make it easier to compensate for variable round counts. It no longer
tries to figure out if a round is a proofing round or
formatting or whatever; it just keeps track of whose name is attached
to which round, and it is your problem to figure out what was done in
which. I did make the (perhaps rash) assumption that every page in a
project will go through the same number of rounds. If (when) some of
the specialty rounds that have been bandied about ever come to pass, I
may have to revise that assumption. I'll burn that bridge when we get
there.
Went through the script and pulled out all references to the deprecated
%pagemarkers hash in the .bin file. The save functions no longer save
the %pagemarkers hash. It has been replaced by the more comprehensive
%pagenumbers hash. They have both coexisted for about 10 versions
(since .53) to give everyone a chance to get upgraded so that files
saved with different versions would be backward and forward compatible.
This shouldn't affect users at all unless you try to switch back and
forth between pre .53 and post .62 versions. If necessary, open and
save a file with one of the interim versions to make a "bridge".
Go to
Change log History
Background:
Guiguts came about because of my frustration
with Proofreaders Toolkit, an older, no longer supported toolkit that
was used to prepare texts for Distributed Proofreaders. Proofreaders
Toolkit (PRTK) has a GUI front end to gutcheck built into it. It works,
but there are several things about it that are suboptimal.
Number 1. It was designed to work with an older version of
gutcheck. The command line options for gutcheck have changed slightly
since the PRTK was written, so it doesn't interface very well.
Number 2. A bigger problem: every time you make an edit to
the file, the list of gutcheck errors becomes unsynchronized and it
becomes hard to find subsequent errors that gutcheck reported.
I had previously written a preprocessing application called Prep to
do pre proofing checks on texts before they were uploaded to the site.
After 8 versions of Prep, I added a Gui front end to it to make it
easier to select options for processing. (There were some 30 or so
options and the command line was getting out of hand. There are over 70
now.) When I added the front end, I changed the name to Guiprep to
differentiate it from command line Prep.
When the frustration level with PRTK's interface to Gutcheck grew
too much, I thought, "Heck, I could probably write something to do
that," and did so. When it came time to name it, I thought, "Well, I
already have Guiprep, a Gui front end to prep; this is a Gui front end
to gutcheck, I'll call it Guigutcheck." But that was too long, so I
shortened it to Guiguts. (Which I find amusing, so that was a
big plus too.)
It has since grown to be a full featured text processing tool kit, of
which the gutcheck interface is a relatively small (though still
important!) part.
Guiguts is written in perl to take advantage of its very powerful
text processing functions and cross platform support. It will run on
Windows, Linux and Mac OSX platforms. It unfortunately cannot be easily
ported to Mac OS 9 and earlier due to the lack of some necessary perl
modules for those OSs.
Since it is written in perl, the source is automatically available for
experimentation and hacking to anyone who is inclined to do so. There
is also a "compiled" Windows executable version (Winguts) included for
those who don't have a perl interpreter on their machine. Winguts
will need to have the perl runtime libraries installed or an up-to-date
perl install for it to work on your system.
The script requires a perl interpreter to run. The ActiveState perl
interpreter is probably the most popular for Windows users (95, 98,
98se, ME, NT, 2K, XP). They also have versions available for Linux and
Solaris. It's very functional and free. (They do ask that you register,
but you can bypass the registration page without entering anything if
you like.) The 5.8.1 or later distribution is necessary to run guiguts.
For Windows users, if you use the Microsoft Installer (MSI) version, it
is very simple and automatic to set up. If you don't have Microsoft
Installer, a link is included on the Activeperl download page.
Installing the script:
Installation is pretty straightforward. The script comes packaged as a
zip archive file. Inside the zip, there is a directory named guiguts.
That directory
contains the script, several support files and several subdirectories.
Just select the directory you want to install guiguts in and unzip the
archive file into it. That's all that is necessary for installation.
There are a few caveats about which directory you should put it in,
however. Guiguts, though a graphical program, is written in perl, which
is built on a command line foundation. (DOS for you Windows users.) As
such, it carries some baggage associated with that. Since it is built
over DOS, you need to follow DOS naming conventions for the directory
it resides in. That is, no directory names in the path with more than
eight characters (there's actually some wiggle room there, but that's
essentially it) and no directory names in the path with spaces in the
name. Something like C:\dp\ is ideal. Something like C:\Program Files\
is not going to work. The script will refuse to run if you install it
in a directory with either of those properties. (Note: There is much
more leeway for using directories with long names under Win2k and
WinXP. The CMD.EXE command interpreter is much better at hiding those
from programs that aren't set up to deal with them. COMMAND.COM, the
command interpreter under Win98 and WinMe, is not that capable,
though.)
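If you want to test a candidate install directory against the two rules above before unzipping, the check can be sketched roughly as follows. This is an illustration only, written in Python rather than taken from guiguts itself; the function name path_problems and the strict eight character cutoff are assumptions (the text above notes there is some wiggle room in practice).

```python
from pathlib import PureWindowsPath

MAX_DOS_NAME = 8  # assumed strict limit; the real rule has some wiggle room


def path_problems(path):
    """Return (directory_name, reason) pairs for risky path components."""
    p = PureWindowsPath(path)
    problems = []
    for name in p.parts:
        if name == p.anchor:
            continue  # skip the drive/root part, e.g. "C:\"
        if " " in name:
            problems.append((name, "contains a space"))
        if len(name) > MAX_DOS_NAME:
            problems.append((name, "longer than 8 characters"))
    return problems


if __name__ == "__main__":
    for candidate in ("C:\\dp\\", "C:\\Program Files\\"):
        print(candidate, path_problems(candidate))
```

An empty list means the path passes both checks; anything else names the directory that is likely to cause trouble.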
Ok, so you've got the script, you've installed it, what else? In order
to make full use of the script, you'll probably want to install a spell
checking package and some sort of image viewer. The script is designed
to work seamlessly with either Aspell or Ispell.
Aspell seems to have better support under Windows, and
is in general a more capable package. Both are free. They both support
many different languages; just download the appropriate dictionary (Aspell win). Make sure you
select a dictionary compiled for the correct OS.
If you are working on a project from Distributed
Proofreaders, you will probably want some sort of image viewer.
The script is agnostic about what viewer to use. Whatever you have
available that you are comfortable with should work, as long as it will
let you pass an image name as a parameter to open it. (Nearly all do.)
There are a few free ones that work quite well and have been
extensively used. Irfanview is a nice full featured
image viewer/processor. XnView is another with very similar
capabilities. I tend to favor XnView
as a viewer because of its resample on scale feature, which makes
images much easier to read when shrunk to fit into a window smaller
than the image. (I prefer Irfanview for editing or processing images,
however. What the heck, get both, they're free!)
If you are generating HTML versions of your files, there is now an
interface to HTML Tidy
built in to the editor. Handy for HTML checking and cleanup without
needing to drop out to another program.
There is a forum
thread on Distributed Proofreaders
dedicated to questions on setting up these programs.
Using the script:
In order to get full use of the Unicode functions, you
will need to have a font that has the Unicode characters defined. Many
of the Windows and Mac fonts have some Unicode support, though coverage
is spotty at best. There are some fonts with better than average
coverage available from various sources. This page
discusses several, with links to where they may be obtained. Two that I
find useful are Bitstream Cyberbit, with 29,934 characters, available
as a free download here
(or Bitstream Cyberbase, which doesn't have the Korean/Japanese
characters), and Code2000, with 34,810 characters, available
here. Code2000
is a shareware font with a five dollar registration fee. If you like it
and use it, I would encourage you to register it. Keep in mind that if
you have installed one of these large fonts, it allows you to see the
characters on YOUR computer. If someone else does not have the font,
(or at least a font that supports the characters you use,) they may not
be able to see what you see.
The script is basically a text editor with some specialized
functionality bolted on. The text editor module is actually fairly
comprehensive. It supports multiple levels of undo, lots of cut and
paste functionality and has many hot key combinations built in. It
provides a front end to gutcheck with automatic cursor placement at
errors/warnings. Unlike PRTK, the error list is tied to the text
itself, so edits early in the file will not cause pointers to later
errors to become invalid every time a line is added or deleted. There
are also quite a few other post processing functions built in to help
tidy up a text.
The first time that you run the gutcheck function, it will ask you
where your gutcheck.exe executable is. The latest version is included
in the gutcheck directory of the script folder.
The first time that you try to open an image through the script, it
will ask you where your image viewer is. Browse to wherever/whichever it
is and select the executable.
The first time you try to run a spell check, it will ask where the
executable for Aspell/Ispell is. Browse to it and select the
executable.
All of these may be set up or changed at any time under the Prefs menu,
"Set File Paths".
When you start the script, a GUI window will open. There is a menu
across the top, a large text area and a status bar at the bottom. The
menu bar groups similar functions together on drop down menus,
pretty much like most other windowed applications.
A short description of the available menu selections:
File: (tear off menu) - File operations; ubiquitous file stuff.
Open - Open a file. The script will remember the last directory that
you opened a file in and browse from there.
--
Recently opened Files: 1 through 10 - Click on a file name to open it.
--
Save - Saves the open file with the current name. Hot key -> Ctrl-s
--
Save As - Save the open file with a different name.
Include - Open a file and insert it after the cursor location in the
currently open file.
Clear - Abandon the open file but don't close the program.
--
Guess Page Markers - If you are working with a file that has no page
markers or has already had the page markers removed, you can use this
function to insert calculated page numbers. Not very accurate.
Set Page Markers - Set page markers all at once using the page
separators from the DP file. Will allow you to use the image viewer
button before you run the Page Separator Fixup function.
--
Exit - Abandon the open file and close the program. Will confirm
discarding any unsaved changes.
Edit: (tear off menu) - File editing functions.
Undo - Multi level undo. Track back all of the changes made since
saving the file.
Redo - Multi level redo. Undo all your undos. A little buggy. Easier
perhaps to make another change to the file, then use undo again. It
will undo all your undos.
--
Col Cut - Take the selected column of text and transfer it to the
clipboard.
Col Copy - Copy the selected column of text to the clipboard.
Col Paste - Take the contents of the clipboard and insert it at the
cursor.
--
Select All - Select all of the text in the window.
Unselect All - Select none of the text in the window.
Search: (tear off menu) - Search and replace functions.
Search & Replace - (Pop up window). Search & replace functions. A
fairly comprehensive search engine. Allows use of regular expressions
(regexes) while searching to do some pretty complex searches.
Stealth Scannos - (Pop up window). Search & replace functions with
automatic loading of stealth scannos. Basically an extension of the
search window that supplies built in search and replace terms. In the
scannos subdirectory, there are several files containing lists of
words that are commonly misscanned as another word (en-common.rc) and a
list of commonly used regexes (regex.rc). This function allows you to
load the search terms into the search window one by one automatically.
Spell Check - (Pop up
window). Spell check the open document if you have Aspell or Ispell
installed on your machine. If you select a portion of the document it
will only check that portion. If you don't select anything, it will
check the entire document. Spellcheck will save the document
before it runs if you have unsaved edits.
Goto Line - Go to the specified line. If you know what line you want
to be on and don't want to scroll and check, scroll and check, you can
jump directly to it. (The current line number is displayed at the
bottom of the screen.)
Goto Page - Go to the specified page. Jump directly to the page number
you enter. Will only work if your file has page markers.
Which Line? - Find out the
line number of the line the cursor is on. Also available at bottom of
page in status bar. A little redundant perhaps.
Find next /*..*/ block - Find the next block of text with /*..*/
markup. Text surrounded by /* .. */ markup will be skipped when
rewrapping. Useful for poetry, tables, etc. This function helps you
easily cycle through the "non-rewrapped" blocks of text.
Find previous /*..*/ block - Find the previous block of text with
/*..*/ markup. Mate to the previous.
Find next /$..$/ block - Find the next block of text with /$..$/
markup. Text surrounded by /$ .. $/ markup will be skipped when
rewrapping. Useful for poetry, tables, etc. This function helps you
easily cycle through the "non-rewrapped" blocks of text.
Find previous /$..$/ block - Find the previous block of text with
/$..$/ markup. Mate to the previous.
Find next /P..P/ block - Find the next block of text with /P..P/
markup. Text surrounded by /P .. P/ markup will be formatted as
poetry. This function helps you easily cycle through the poetry blocks.
Find previous /P..P/ block - Find the previous block of text with
/P..P/ markup. Mate to the previous.
Find next indented block - Find
next block of text that is indented. (May or may not be in a marked up
block)
Find previous indented block -
Find previous block of text that is indented.
Find Orphaned Brackets - Find orphaned or nested brackets: ( ), [ ],
{ } or < >, or markup /* */, /# #/ or /$ $/. Sometimes brackets are
unpaired in a text and it can be a real pain trying to find the
unpaired bracket. This special search function makes it easy. It will
search for and highlight any unmatched brackets / markup it can find.
Highlight double quotes in selection - Highlight all of the double
quotes in the selection to find unmatched ones. A special function to
help find mismatched double quotes in a selection (usually a
paragraph). It can be very difficult to find missing quote marks. This
highlights them to make it easier to pick them out. Hot key ->
Ctrl-Shift-"
Highlight single quotes in selection - Highlight all of the single
quotes in the selection to find unmatched ones. Same thing for single
quotes. Hot key -> Ctrl-'
Remove Highlights - Unhighlight any highlighted text. Typically used
together with the previous two functions. Hot key -> Ctrl-0
Bookmarks: (tear off menu) - Set and jump to bookmarks in the text. You
can keep track of up to five places in the text and jump instantly to
them using these functions. They will be remembered from session to
session. (Each text will remember its own.)
Selection: (tear off menu) - Perform operations that are typically done
on a selection of text.
lowercase Selection - Convert
the selected text to all lowercase.
Sentence case Selection -
Convert the selected text to sentence case. (First word capitalized,
the rest all lower case.)
Title Case Selection -
Convert the selected text to title case. (First letter of each word
capitalized)
UPPERCASE selection - Convert
the selected text to all UPPERCASE.
--
Surround Selection With - Insert customizable text around the selected
text. (Defaults to an underscore, the traditional Gutenberg ASCII
italic marker.)
Flood Fill Selection With - Overwrite the selected text with
customizable text (defaults to space). The Control+w hot key will just
overwrite without popping up the string editing window.
--
Indent Selection 1 - Move the
selected text right by one space.
Indent Selection -1 - Move the
selected text left by one space, will not remove non whitespace
characters. You can destroy relative indenting if you continue to left
indent after the text is already in the first column.
--
Rewrap Selection - Rewrap the
selected text. Will skip any text inside /* */ or /$ $/ markup. Will do
block indenting of text inside /# #/ markup.
Block Rewrap Selection - Rewrap
the selected text using the block rewrap margins. Will skip any text
inside /* */ or /$ $/ markup.
Interrupt Rewrap - Break into
a rewrap routine. (It can get pretty long for large files.) A small
window with an Interrupt button in it will pop up as well during the
rewrapping function.
ASCII Boxes - A special
function to automatically draw ASCII boxes around a selection of text.
Can be set to automatically rewrap the text and left, center or right
justify the text within the box. The drawing characters are selectable.
See an example here.
(will open in its own window)
Align text on string - A
specialized function to help align columns of numbers or text that
should be aligned on a common character. (often a period/decimal point)
This function will let you specify a character to align on, then adjust
the indent of all the lines in the selection that contain the alignment
character so that they line up.
Convert To Named/Numeric Entities
- Convert anything in the selected text that isn't ASCII to HTML
Named/Numeric Characters.
Convert From Named/Numeric Entities
- Convert any HTML Named or Numeric characters in the selection to
Unicode.
Convert Fractions - Convert
any fractions that are within the Unicode standard to named or numeric
entities.
Fixup: (tear off menu) - Specialized functions for post processing.
Run Word Frequency Routine - (Pop up window) Make a list of all of the
distinct words with a count of how often they occur in the text. List
them case sensitively or insensitively (Polish polish), and sort them
by frequency or alphabetically. Left click on a word to transfer it
to the search function, or just right click to search the text for that
pattern. There are several sub functions as well. These are more fully
explained in the "A Few Details" section.
--
Run Gutcheck - (Pop up window)
Run gutcheck against the file that is currently loaded in the text
editor. For speed, gutcheck is actually run against the file on the
disk rather than the text in the open window. Because of this, the
file will be saved first if it has been edited, to
prevent it from running against a stale file. The first time gutcheck
is run, it will ask where the gutcheck executable is. There is a copy
included in the gutcheck directory under the guiguts directory. Browse
to it, select the executable and click OK.
Gutcheck options - (Pop up window) Change the behavior of gutcheck.
Options -y (redirect stderr) and -e (don't echo lines) are set in the
program and are not configurable (the script wouldn't work very well
otherwise). Most other options are available here and can be customized
to your satisfaction. See the gutcheck documentation for more
information about specific options.
-v - Enable verbose mode. Should be enabled for most purposes.
-t - Enable check for common typos. Do some basic checking for common
typos/scannos. Enabled automatically if paranoid mode is enabled.
-x - Disable paranoid mode. Relax checking rules. Not really
recommended for most purposes.
-p - Report ALL unbalanced single quotes.
-s - Report ALL unbalanced double quotes.
-m - Interpret HTML markup. Will check line lengths as if the markup
was not there. Automatically enabled if a threshold number of HTML
entities is found in a file.
-l - Do not report non DOS newlines. Unix and Mac use different newline
characters. This will suppress warnings about them.
--
Remove End-of-line Spaces -
Routine to remove all end-of-line spaces. Also run during Fixup
routine. A common gutcheck warning.
Run Fixup Routine - Routine to
do automatic fix up of a bunch of common problems. See paragraph below
in the "A Few Details" section.
--
Fix Page Separators - Functions to help automate removal of
page separators from DP texts and format the text around them.
Remove Blank Lines Before Page
Separators - Tidy up blank lines at page separators.
--
Footnote Fixup - Footnote
consistency checking and moving tools. Pop up window. See Details for
more info.
HTML fixup - HTML tools and
conversion. Pop up window. See Details for more info.
Sidenote Fixup - Sidenote moving and checking tool. Will search for
[Sidenote ] markup and move any it finds to the beginning of the
paragraph it finds it in.
Reformat Poetry Line Numbers - Right justify and align poetry line
numbers to the right of the text. The line numbers must be the last
characters on the line and must be separated from the text by at least
two spaces.
Convert Windows Codepage characters to Unicode - Convert characters in
the hex 80-9F range to their Unicode equivalents. These characters
normally shouldn't show up in your text, but if a proofer cut and
pasted the page into Word, proofed it and then pasted it back, these
characters may get inserted.
--
ASCII Table Special Effects -
Tools to adjust and reformat ASCII tables.
--
Clean Up Rewrap Markers - Removes all of the /* */ & /# #/ markup from
the text. Deletes the entire line that the markup is on.
--
Add a Thought Break - Add a standard Distributed Proofreaders "Thought
break":
*       *       *       *       *
Prefs: (tear off menu) - Set up program preferences.
Set Rewrap Margins - Adjust
margins for rewrap functions. The left margin function is NOT "leave
this many spaces" it is "Start in this column". So a left margin of 5
would leave four spaces before the text.
Font - (Pop up window) Adjust
font, font size and font weight. Viewable format only. Will not affect
file, just the viewing parameters.
Browser Start Command - Set the
startup command for your web browser. Probably best left as 'start'
under Windows. Enter the full path to the executable.
Set File Paths - Set the paths to the various support programs.
(gutcheck, Aspell, tidy, image viewer, pngs directory)
Leave Bookmarks Highlighted - By
default, bookmarks are only highlighted when you set or jump to them.
Checking this will leave the bookmarks highlighted all the time.
Disable Quotes Highlighting - The
text editor can highlight pairs of quotes, double quotes, brackets or
parentheses automatically when the cursor is placed between them. It
can be distracting though. This option disables it.
Keep Popups On Top - If this is selected, many pop
up windows will stay on top of the main window, even if the focus
changes.
Disable Bell - Many operations,
particularly searches, sound the system bell when warning about errors.
This disables the audible warning.
Auto Set Page Marks On File Open
- Toggle auto page marker set when file loads. Probably should be left
enabled unless you are working with very large files.
Toolbar Prefs - Enable/disable
the toolbar, and select which side of the editing window you would like
it to be on when you start the program.
Set Button Highlight Color - The
highlight color is the color that the button changes to when active.
The search window will flash the search button in the highlight color
as a warning when no terms were found. I picked a default color.
Change it here if you like.
Spellcheck Dictionary Select -
Shortcut to the Aspell dictionary select routine.
Toggle Auto Save - Enable or
disable automatic saving of the open file at a fixed interval.
The interval defaults to 5 minutes.
Auto Save Interval - Pop up a
box where you can adjust the interval between auto saves.
Toggle Auto Backups - Enable
or disable automatic backups when the file is saved. Works with both
user initiated and automatic saving. Saves the previous two iterations
of the file so you could roll back changes if desired.
Help: (tear off menu) - Various help items.
About - About the program
Versions - A pop up window
listing the version numbers of most of the software and libraries
involved with running guiguts. Mostly useful for troubleshooting.
Open Manual - Open the local
copy of this document
Check for updates -
Automatically connects to the server where the files are hosted and
checks if there is a more recent version available.
Hot keys - A pop up list of the
various hot key bindings
Function History - A history of
all of the functions that have been performed on a particular file.
Greek Transliteration - Pop up
a Greek transliteration chart.
Latin-1 Chart - Pop up a chart
of Latin-1 characters not on a US standard keyboard.
External: (tear off menu) - User configurable hooks to external
programs.
0 - 9 - Ten slots where you can set up external program calls.
Setup - Pop up window where you
can set up calls to external files and programs.
Unicode - A drop down list of different Unicode character groups. Not
comprehensive, though most groups from 0100 through FFFF are
represented.
Sort by Name / Range - Change
the sort order that the Unicode blocks are displayed in the drop down.
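To make the "Convert Windows Codepage characters to Unicode" entry in the Fixup menu above more concrete, here is a minimal sketch of that kind of conversion. It is written in Python for illustration and is not the guiguts code; the function name fix_cp1252_controls is hypothetical, and it assumes the stray characters came from Windows codepage 1252, which is what Word typically produces on Western systems.

```python
def fix_cp1252_controls(text):
    """Map characters in the hex 80-9F range to their cp1252 meanings."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x80 <= code <= 0x9F:
            try:
                # Reinterpret the code as a single cp1252 byte.
                ch = bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few values in this range are unassigned in cp1252;
                # leave those characters untouched.
                pass
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    pasted = "\x93Quoted\x94 text\x85"
    print(fix_cp1252_controls(pasted))
```

In this sketch hex 93 and 94 become curly double quotes and hex 85 becomes an ellipsis character, which are the usual culprits after a round trip through a word processor.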
A few details:
Spell check is almost completely
implemented at this point. If you have Aspell or Ispell installed on
your system, it will check through the file for misspelled words and
let you cycle through the file checking each in turn. It will pop up a
list of guesses for each misspelled word. Double click on a word to
move it to the replacement text box, then click change to replace it in
the text. (Or triple left click on a word to use it as a
replacement.) It is modeless, so you can go and edit in the
main window and then pick up where you left off with the spell check,
though you should probably avoid this if possible as it can confuse the spell checking
code. The Aspell package will learn from mistakes. If you have a word
misspelled in the text and use one of the replacement terms, the next
time it sees that misspelling, it will put the previously selected
replacement higher in the list of possibles.
The column cut and column copy functions are a little
tricky to use. The selection highlighting does not stay within the
bounding box, so it looks like you are selecting entire lines. When you
use the column functions though, the only text that is actually
selected is whatever is within the bounding box formed by the upper
left selection point and lower right selection point. The selection
highlighting will extend past the actual selection.
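The bounding box idea can be sketched as follows. This is a Python illustration, not the guiguts implementation; the function name column_copy and the zero-based row and column numbering are assumptions made for the example.

```python
def column_copy(lines, top, left, bottom, right):
    """Return the text inside a rectangular selection.

    Rows top..bottom are inclusive; columns run from left up to, but not
    including, right. All indexes are zero-based in this sketch.
    """
    return [line[left:right] for line in lines[top:bottom + 1]]


if __name__ == "__main__":
    text = [
        "alpha 100 x",
        "beta  200 y",
        "gamma 300 z",
    ]
    # Upper left corner at row 0, column 6; lower right at row 2, column 9.
    for piece in column_copy(text, 0, 6, 2, 9):
        print(piece)
```

Only the characters between the two corner columns are taken from each row, even though an ordinary selection between the same two points would cover everything from the first corner to the second, line ends included.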
The gutcheck function will run
gutcheck against a copy of the file from your disk and pop up a window
with the list of
errors/warnings that it found. It may take some time to run, especially
on long files and/or slow computers. When you double click on an error
in the pop up window, the cursor in the text window will move to the
error, or as close to the error as possible. If you click again, the
focus will change to the text window. For some errors, the cursor may
not end up exactly at the reported error's location. This could be for a
variety of reasons. HTML markup changes how gutcheck reports column
locations, so that could throw it off. Queries of very short or common
words may find an instance earlier on a line than the one actually
being queried. In general, it will nearly always get the line right,
and get the column right better than 3/4 of the time.
Handy Tip: While working in gutcheck,
for mismatched quotes errors,
click on the gutcheck warning 3 times. This will move the cursor to the
end of the paragraph with the mismatched quotes and change the focus to
the text window. Press "Control-Shift-Up Arrow" to select the paragraph
before the cursor, press "Control-Shift-Double quote" to highlight all
of the double quotes in the paragraph. Makes it much easier to pick out
the missing/extra quote marks. (Also works for "Control-Single Quote"
for single quote mismatches.) Press "Control-Zero" to remove all
highlights.
The word frequency function will build an index of all of the words in
the file with a count of the number of times each appears. By default,
the word list is built doing a case sensitive search ("This" is
different from "this"), with the words listed by frequency. You can
change the search parameters to make the search case insensitive, and
you can change the order of the list to be sorted alphabetically or by
the number of occurrences. When you left click on a word in the list,
it will automatically copy it to the search text entry box if the
search pop up is open. If you right click on a word, it will
automatically search the text for that pattern. It will search using
the options selected on the search pop up (case sensitive or not, whole
word or not). There are a few sub sort functions too, to allow you to
easily do some specialized word frequency sorts.
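The basic count can be sketched in a few lines of Python. This is only an illustration: the function name, the options and the very simple idea of a "word" (a run of word characters) are assumptions, and guiguts itself is more careful about apostrophes and hyphens.

```python
import re
from collections import Counter


def word_frequency(text, ignore_case=False, alphabetical=False):
    """Count words; sort by frequency (default) or alphabetically."""
    words = re.findall(r"\w+", text)      # simplistic word splitting
    if ignore_case:
        words = [w.lower() for w in words]
    counts = Counter(words)
    if alphabetical:
        return sorted(counts.items())
    # Most frequent first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


sample = "This is this and that. This is it."
print(word_frequency(sample))
print(word_frequency(sample, ignore_case=True))
```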
1st Harmonic - Will pop up a window and display all of the words in a
text that are one edit away from the word selected. For instance, if
you select "the" and press First Harmonic, you might end up with the
list "he, she, She, the, The, them, then, they, thy, tie". These are
all the words in the text that can be gotten from the selection with
only one edit. (Well, you get the original word back, so one or
fewer...) The edit can be a replaced letter, a removed letter or an
added letter, but there can be only one edit. You must have selected a
word in the word frequency window or it won't return a list. Different
texts will get different lists. It doesn't return every POSSIBLE
variation in spelling, only those that are present in the text. There
is a hot key shortcut - Ctrl-left click. The harmonics window has the
same search bindings as the word frequency window (left click to pop up
the search window, right click to search with the current search
settings). You can also recursively do a harmonic search on a word in
the harmonics window. You must use the hot key to do so.
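The "one edit away" test can be sketched like this in Python. It is a minimal illustration of the idea, not the routine guiguts uses, and the function names and sample vocabulary are invented.

```python
def within_one_edit(a, b):
    """True if b is a with at most one letter replaced, added or removed."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                  # make a the shorter (or equal) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1                       # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]    # at most one replaced letter
    return a[i:] == b[i + 1:]            # exactly one extra letter in b


def first_harmonic(word, vocabulary):
    """Words from the text's vocabulary that are one edit (or none) away."""
    return sorted(w for w in vocabulary if within_one_edit(word, w))


vocab = {"he", "she", "the", "them", "thy", "tie", "that", "those", "cat"}
print(first_harmonic("the", vocab))
```

Only words actually present in the vocabulary are returned, which is why different texts give different lists.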
All Words - Will display all of
the words found in the document. (Default display.)
Re Run - Clears all the word
hashes out of memory and runs a fresh sort on the file to pick up any
edits you may have made.
Check Emdashes - Will sort out
all of the emdash phrases in the text and display them with the
frequency that they occurred. If there is a word that is identical to
one of the emdash phrases except it has a hyphen instead of an emdash,
it will be displayed as
well with a string of asterisks next to it **** so it can be easily
picked out.
Check Hyphens - Will sort
out all of the words with hyphens in them and display them with the
frequency they occurred. If there is a word that is identical to one of
the hyphenated words only without a hyphen, or, is identical except it
has an emdash, it will be displayed as
well with a string of asterisks next to it **** so it can be easily
picked out. Makes it easy to find inconsistently hyphenated words.
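A rough Python sketch of the hyphen check follows. It only handles the "same word without the hyphen" case, not the emdash variant, and the function name and the letters-only word pattern are assumptions for the example.

```python
import re
from collections import Counter


def check_hyphens(text):
    """List hyphenated words, flagging unhyphenated twins with ****."""
    # A word is letters, optionally joined by single hyphens.
    counts = Counter(re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text))
    report = []
    for word in sorted(w for w in counts if "-" in w):
        report.append((word, counts[word], ""))
        joined = word.replace("-", "")
        if joined in counts:
            report.append((joined, counts[joined], "****"))
    return report


sample = "The to-day paper said today that to-day is fine."
for word, count, flag in check_hyphens(sample):
    print(count, word, flag)
```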
Check Alpha/num - Will sort out and display all of the words in a text
that contain a mixture of letters and digits, i.e. it will display 1st,
2nd, 23rd, 75c, etc. It will also display l86O, l9th, 0ddba11, grumb1e
and M1STAKE.
Check spelling - This will run a spell check on the file and return the
list of "misspelled" words that it found in the file. Note: at this
time, Aspell will not handle multi byte characters, so any words with
multi byte characters are filtered out of the file and added to the
"misspelled" list unless they are in the project dictionary. They may
be spelled correctly, but Aspell has no way of knowing (yet).
Ital/Bold Words - Will sort out
and display words and phrases that are marked up with bold or italics
markup. There is a four word threshold set by default. (Will not
display phrases longer than 4 words.) The threshold is adjustable by
right clicking on the Ital/Bold Words button.
Check All Caps - Will sort out
and display all of the words that have no lower case letters. (10TH
would be displayed, even though it contains digits, since it contains
no lower case letters.)
Check MiXeD CasE - will display all words with a mixture of lower case
letters and at least one upper case letter not at the beginning of the
word.
Initial Caps - will display all words with an initial capital letter
and no other upper case letters.
Character counts - Much like
the name suggests, a list of the different characters that appear in
the text and how many times they appear. White space characters are
represented by their names rather than the actual character.
Check , Upper - will display
all phrases that have an upper case character after a comma
(comma/period error). Will search across lines. Terms that have a
newline in them will have the newline represented by \n.
Check . Lower - will display
all phrases that have a lower case character following a period
(comma/period error). Will search across lines. Terms that have a
newline in them will have the newline represented by \n.
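Both comma/period checks boil down to a pattern that is allowed to run across a line end. Here is a minimal Python sketch; the patterns and function names are assumptions and only approximate what guiguts looks for.

```python
import re


def comma_upper(text):
    """Phrases with an upper case letter right after a comma."""
    hits = re.findall(r"\w+,\s+[A-Z]\w*", text)      # \s+ also matches newlines
    return [h.replace("\n", "\\n") for h in hits]    # show newlines as \n


def period_lower(text):
    """Phrases with a lower case letter right after a period."""
    hits = re.findall(r"\w+\.\s+[a-z]\w*", text)
    return [h.replace("\n", "\\n") for h in hits]


sample = "He came, The dog ran. then it sat,\nAnd left."
print(comma_upper(sample))
print(period_lower(sample))
```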
Check Accents - Will sort out and display all of the words in the text
that contain an accented letter. If there are any words that are
identical except they have an unaccented letter, they will also be
displayed with a string of asterisks next to them **** so they can
easily be picked out. Makes it easy to find inconsistently accented
words.
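One way to find the unaccented twin of a word is to decompose it and drop the combining marks. The Python sketch below does that with the standard unicodedata module. It is an illustration under that assumption, not a description of how guiguts does the comparison.

```python
import re
import unicodedata
from collections import Counter


def strip_accents(word):
    """Remove combining accent marks, e.g. o-circumflex becomes o."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def check_accents(text):
    """List accented words, flagging unaccented twins with ****."""
    counts = Counter(re.findall(r"[^\W\d_]+", text))   # runs of letters
    report = []
    for word in sorted(counts):
        plain = strip_accents(word)
        if plain == word:
            continue                                   # no accent in this word
        report.append((word, counts[word], ""))
        if plain in counts:
            report.append((plain, counts[plain], "****"))
    return report


sample = "The r\u00f4le was a role; the caf\u00e9 had a r\u00f4le."
for word, count, flag in check_accents(sample):
    print(count, word, flag)
```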
Unicode > FF - This will sort out and display all words (words, not
characters!) that contain any Unicode character greater than hex FF
(decimal 255). Characters FF (255) and lower are in the Latin-1
character set.
Stealtho check - This is another way to check for scannos in the file.
You can use the en-common.rc file to check for words that are commonly
misscanned for each other, or you could use the misspellings.rc word
list, which will look for the 3500 or so most common misscanned words.
Both files are in the scannos directory.
The Fixup function will comb through the text and make innocuous and
easily automated repairs to the text. It will pop up a window and
allow you to customize which checks you want to perform.
As of now, the fixes it will perform are:
• Remove spaces at end of line.
• Remove spaces on either side of hyphens.
• Remove space before periods.
• Remove space before exclamation points.
• Remove space before question marks.
• Remove space before commas.
• Remove space before semicolons.
• Remove space after opening and before closing brackets.
• Remove space after open angle quote and before close angle quote.
• Remove space after beginning and before ending double quote.
• Ensure space before ellipses except after period.
• Format any line that contains only 5 *s and whitespace to be the
standard 5 asterisk thought break.
• Convert multiple spaces to single space.
• Fix obvious l<-->1 problems.
You can also specify whether to skip text inside the /* */ markers or
not.
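Most of these fixes are simple pattern substitutions. As a rough illustration, here is a Python sketch of four of them applied to a single line. The patterns are assumptions, the order they run in is arbitrary, and the real Fixup routine handles more cases (for instance it can skip /* */ blocks, and this sketch would also squash leading indentation).

```python
import re


def fixup(line):
    """Apply a small subset of the Fixup repairs to one line."""
    line = re.sub(r"[ \t]+$", "", line)         # spaces at end of line
    line = re.sub(r" +([!?,;])", r"\1", line)   # space before ! ? , ;
    line = re.sub(r" *- *", "-", line)          # spaces around hyphens
    line = re.sub(r" {2,}", " ", line)          # multiple spaces to one
    return line


print(fixup("Well , this  is a half - baked idea !  "))
```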
Fix Page Separators will pop up a window with several buttons on it to
help automate removal of page separators from files from Distributed
Proofreaders.
The Buttons are:
•Join Lines - join the lines on either side of the separator, removing
any blank lines, spaces, asterisks and hyphens as necessary. - Hotkey
-> j (Notice: this will remove any leading hyphen, spaces and
asterisks from the line after the separator as well.)
•Join, Keep Hyphens - join the lines on either side of the separator,
removing any blank lines, spaces and asterisks as necessary. - Hotkey
-> k
•Blank Line - remove the separator, leave one blank line. (paragraph
break) - Hotkey -> l
•New Chapter - remove the separator, leave four blank lines. (chapter
break) - Hotkey -> h
•Refresh - find, center and highlight the next page separator -
Hotkey -> r
•Undo - automatically back out of all of the changes made for the last
separator edit - Hotkey -> u
There are also some check boxes to control some of the functions.
•Full Auto will automatically search for the next page separator as
soon as one has been done and try to automatically process it if it
can. - Toggle state - a
•Semi Auto will automatically search for the next page separator as
soon as one has been done and wait for you to select an operation. -
Toggle state - s
HTML Fixup:
Pops up a button bar which has most of the popular HTML markup on it;
at least, all that is easily translatable to TEIlite. Will
automatically insert the selected markup around the selected text. Some
markup buttons act differently depending on what text is selected.
There is an Autogenerate HTML function that will do basic conversion to
an HTML version. The available markup and functions:
Autogenerate HTML: Will do basic conversion to HTML. Will attempt to
make links to out-of-line footnotes if found. Will preserve line
breaking in text marked with /*..*/ (there needs to be a blank line
before the open and after the close delimiters). Will try to preserve
indenting; not real elegant, but it tries. Will automatically add HTML
header and footer if not there already.
Custom Page Labels: Configure
page numbers independently of image numbers.
Auto Illus Search:
Automatically search for [Illustration: markers and interactively allow
you to select images to display there.
Pg #s as comments - Insert the
page numbers as HTML comments.
Pg #s as anchors - Insert the
page numbers as internal HTML links (anchors).
<i> Italics -
Insert <i> </i> around the selected text, removing any that
may be in the selection.
<b> Bold -
Insert <b> </b> around the selected text, removing any that
may be in the selection.
<u> Underline -
Insert <u> </u> around the selected text, removing any that
may be in the selection.
<center> Center -
Insert <center> </center> around the selected text,
removing any that
may be in the selection.
<h1>Header 1 -
Insert <h1> </h1> around the selected text.
<h2>Header 2 -
Insert <h2> </h2> around the selected text.
<h3>Header 3 -
Insert <h3> </h3> around the selected text.
<h4>Header 4 -
Insert <h4> </h4> around the selected text.
<h5>Header 5 -
Insert <h5> </h5> around the selected text.
<h6>Header 6 -
Insert <h6> </h6> around the selected text.
<p> Paragraph -
Insert <p> </p> around the selected text.
<br> Line break -
Insert <br> at the end of each line of the selected text,
or at the cursor if no selection is made.
<hr> Horizontal line -
Insert <hr> before the selected text, or at the cursor if
no selection is made.
Non breaking space - Replace space with &nbsp; wherever there are two
or more adjacent spaces in the selected text, or at the cursor if no
selection is made.
Poetry - Will automatically
insert markup used in /p p/ blocks in the selection.
<big> - Insert
<big> </big> around the selected text
<small> - Insert
<small> </small> around the selected text
<ol> Ordered list -
(numbered list) Insert <ol></ol> around the selected text.
Also need to define list items inside it.
<ul> Unordered list -
(bulleted list) Insert <ul></ul> around the selected text.
Also need to define list items inside it.
<li> List item -
Insert <li></li> around the selected text. A list item.
Defaults to bulleted unless surrounded with <ol></ol>.
<sup> Superscript -
Insert <sup> </sup> around the selected text.
<sub> Subscript -
Insert <sub> </sub> around the selected text.
<table> Table -
Insert <table> </table> around the selected text. Will need
to define rows and columns.
<tr> Table row -
Insert <tr> </tr> around the selected text. Meaningless
without <table></table> markup.
<td> Table column or cell -
Insert <td> </td> around the selected text. Meaningless
without <table></table> markup.
<blockquote> Block quote -
Insert <blockquote> </blockquote> around the selected
text. (Indented from each margin)
<code> Code -
Insert <code> </code> around the selected text.
(mono spaced font)
Named anchor: -
Insert an internal anchor (for a link) before the selected text, using
the selected text as the name.
Image: -
Insert an image anchor before the selected text, using the selected
text as the alternate text. Will ask for image directory first time and
remember it.
External link: -
Insert a link around the selected text, using the selected text as the
link text, to an external file or location.
Internal link: -
Insert a link around the selected text, using the selected text as the
link text, to a previously created Named anchor.
Remove markup from selection:
- Remove all HTML markup from the selected text, will warn if it
leaves orphans as a result.
Find orphan markup:
- Search for all HTML markup that is opened but not closed or
closed but not opened.
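The orphan search amounts to pairing each closing tag with an earlier opening tag of the same name. Here is a minimal Python sketch of that idea. The function name, the tag pattern and the short list of tags that never get a closing tag are all assumptions, not the checks guiguts itself makes.

```python
import re

VOID_TAGS = {"br", "hr", "img"}   # assumed: tags that never take a closer


def find_orphans(html):
    """Return (opened_but_not_closed, closed_but_not_opened) tag names."""
    open_tags, stray_closers = [], []
    for closing, name in re.findall(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>", html):
        name = name.lower()
        if name in VOID_TAGS:
            continue
        if not closing:
            open_tags.append(name)
        elif name in open_tags:
            # Pair the closer with the most recent matching opener.
            idx = len(open_tags) - 1 - open_tags[::-1].index(name)
            del open_tags[idx]
        else:
            stray_closers.append(name)
    return open_tags, stray_closers


print(find_orphans("<p><i>one</i> two</b><br> <b>three</p>"))
```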
Auto list - Will automatically
place list markup in a selection. Each line will be treated as a list
item.
Auto table - Will automatically
place table markup in a selection. Each line will be treated as a row,
two or more spaces between items will denote a cell.
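The Auto table rule (one row per line, cells split wherever there are two or more spaces) can be sketched like this in Python. The function name and the exact markup it emits are assumptions; the real button may produce different attributes or layout.

```python
import re


def auto_table(selection):
    """Wrap each line in <tr>, splitting cells on runs of 2+ spaces."""
    rows = []
    for line in selection.splitlines():
        if not line.strip():
            continue                              # ignore blank lines
        cells = re.split(r" {2,}", line.strip())
        tds = "".join("<td>" + cell + "</td>" for cell in cells)
        rows.append("<tr>" + tds + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(auto_table("Name  Age\nAnn Lee  30"))
```

A single space (as in "Ann Lee") stays inside its cell; only two or more spaces start a new one.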
div - Insert a div around a
selection. Will use the style in the entry box to the left.
span - Insert a span around a
selection. Will use the style in the entry box to the left.
Header: Insert HTML header and
footer. You can customize the header in the external file "header.txt"
located in the guiguts directory.
Find and Format Poetry Line #s - Mark up any poetry line numbers it
finds with appropriate markup during autogenerate.
Footnote Fixup:
The way the function works: Open the Footnote Fixup routine under the
Fixup menu. Right in the middle of the dialog window that opens, is a
button called First Pass. This will comb through the file and find
everything it thinks is a footnote. Once this is finished, you need to
manually
step through the footnotes it found and check each one to make sure it
has no errors. (Missing open or closing bracket usually) If a bracket
is missing, the footnote highlighting will extend beyond its
boundaries. You will need to add the enclosing bracket, then hit Adjust
Bounds, to re-search for the limits of that footnote.
If the footnote was left as an out-of-line footnote by the proofers, it
will try to find the anchor in the text, if possible. The tool depends
on the footnote being formatted "[Footnote xx:" where xx is the
footnote letter/number/symbol. If the footnote marker is formatted with
the number/symbol following the colon, "[Footnote: xx", it will not be
able to identify it. In these cases, you must tell the script where to
set the anchor. You will get different behaviors depending on whether
you elect to do In-line or Out-of-Line footnotes. For inline footnotes,
put the cursor at the point where you want the anchor to be and press
Set Anchor. If there is an existing anchor, it will be deleted and the
footnote will be moved to where the anchor was. If there was no anchor,
the footnote will be moved to the new anchor point (where the cursor
is). For Out-of-line footnotes, select Number, Letter, or Roman to
select the symbol type. If an anchor exists, it will change to the next
available symbol of that type. If no anchor exists, it will add an
anchor of the selected type at the present cursor location. **Notice,
you may get duplicate footnote symbols. That will be fixed in the re
indexing step.** If you have a footnote that has been broken across a
page, you can use Join With Previous to automatically rejoin the two
halves.
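To show why the position of the colon matters, here is a small Python sketch of the kind of pattern that recognizes "[Footnote xx:" but not "[Footnote: xx". The regular expression, the function names and the assumption that anchors look like "[xx]" in the text are all made up for the example; guiguts' own search is more involved.

```python
import re

# Label first, then the colon: "[Footnote 12:" or "[Footnote A:".
FOOTNOTE_RE = re.compile(r"\[Footnote\s+([^\s:\]]+):")


def find_footnotes(text):
    """Return (label, position) for every recognizable footnote."""
    return [(m.group(1), m.start()) for m in FOOTNOTE_RE.finditer(text)]


def find_anchor(text, label, footnote_pos):
    """Position of the nearest "[label]" before the footnote, or -1.

    Assumption: anchors are written "[label]" and precede the footnote.
    """
    return text.rfind("[" + label + "]", 0, footnote_pos)


text = ("Text[1] more.\n\n[Footnote 1: A note.]\n\n"
        "[Footnote: 2 Another.]\n\n[Footnote A: Third.]")
notes = find_footnotes(text)
print(notes)
print(find_anchor(text, "1", notes[0][1]))
```

The second footnote in the sample is skipped because its label follows the colon, which is exactly the case where you have to set the anchor by hand.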
Once you have stepped through and checked, adjusted, fixed and
anchored all of your footnotes, hit the button Re Index. For
inline footnotes, this will go through and delete any
remaining anchor markers and move the footnotes into place if
necessary. For out-of-line, it will renumber all of the footnotes using
the same family of symbol that it had originally or a number if it had
no anchor marker. This will close up any gaps in the numbers and remove
duplicates. You can make changes and re index as often as you like.
Once that is finished, inline footnotes are done. Out-of-line footnotes
will need to have a place (or places) selected for the footnotes to be
moved to--end of text, end of each chapter, whatever you want. There
are two buttons to automatically set landing zones at the end of every
chapter or at the end of the text. Alternatively, you can manually
select where you want the footnote landing zones to be. Put the
cursor where you want footnotes to be moved to and press Set Landing
Zone. This will insert the marker text "FOOTNOTES:" at that spot. The
footnotes between that landing zone and the previous one will be moved
to just past that marker. You can have as many landing zones as you
like, and can step through them adding and removing as necessary.
After your footnotes have been relocated, you can redo a first pass to
check that they are all correct. You will probably need to check
Unlimited Anchor Search. (Normally the script avoids searching before
the present page, more or less, to keep from finding anchors from
previous footnotes early on when there are probably multiple footnote
1s.) You can easily view the anchor and footnote by pressing the
appropriate button.
A handy hint: After the first pass has been done but before the second,
run the Word frequency routine, sort alphabetically, and check to see
that you have the same number of occurrences of Footnote in word
frequency as you do in the Footnote fixup box. If not, you may have a
problem in the text, probably a footnote with missing or incorrect
brackets.
Hot keys:
There are many, many functions available through hot key combinations
as well. Here is a fairly complete list.
<ctrl>-x -- cut
<ctrl>-c -- copy
<ctrl>-v -- paste
<ctrl>-a -- select all
<ctrl>-s or <ctrl>-S -- save file
<ctrl>-f or <ctrl>-F -- pop up search window
F1 -- column copy
F2 -- column cut
F3 -- column paste * Notice: column paste should only be used
to paste a column onto lines that already contain text. To paste a
column of text onto blank lines, use standard paste: <ctrl>-v
<ctrl>-u -- convert selection to upper case
<ctrl>-l -- convert selection to lower case
<ctrl>-t -- convert selection to title case
<ctrl>-i -- insert a tab character before cursor (Tab)
<ctrl>-j -- insert a newline character before cursor (Enter)
<ctrl>-o -- insert a newline character after cursor
<ctrl>-d -- delete character after cursor (Delete)
<ctrl>-h -- delete character to the left of the cursor (Backspace)
<ctrl>-k -- delete from cursor to end of line
<ctrl>-z -- undo
<ctrl>-y -- redo
<ctrl>-e -- move cursor to end of current line. (End)
<ctrl>-b -- move cursor left one character (left arrow)
<ctrl>-p -- move cursor up one line (up arrow)
<ctrl>-n -- move cursor down one line (down arrow)
<ctrl>Home -- move cursor to the start of the text
<ctrl>End -- move cursor to end of the text
<ctrl>-right arrow -- move to the start of the next word
<ctrl>-left arrow -- move to the start of the previous word
<ctrl>-up arrow -- move to the start of the current paragraph
<ctrl>-down arrow -- move to the start of the next paragraph
<ctrl>PgUp -- scroll left one screen
<ctrl>PgDn -- scroll right one screen
<shift>-Home -- adjust selection to beginning of current line
<shift>-End -- adjust selection to end of current line
<shift>-up arrow -- adjust selection up one line
<shift>-down arrow -- adjust selection down one line
<shift>-left arrow -- adjust selection left one character
<shift>-right arrow -- adjust selection right one character
<shift><ctrl>Home -- adjust selection to the start of the text
<shift><ctrl>End -- adjust selection to end of the text
<shift><ctrl>-left arrow -- adjust selection to the start of the
previous word
<shift><ctrl>-right arrow -- adjust selection to the start of the next
word
<shift><ctrl>-up arrow -- adjust selection to the start of the current
paragraph
<shift><ctrl>-down arrow -- adjust selection to the start of the next
paragraph
<ctrl>-/ -- select all
<ctrl>-\ -- unselect all
<Esc> -- unselect all
<ctrl>-' -- highlight all apostrophes in selection.
<ctrl>-" -- highlight all double quotes in selection.
<ctrl>-0 -- remove all highlights.
<Insert> -- Toggle insert / overstrike mode
Double click left mouse button -- select word
Triple click left mouse button -- select line
<shift> click left mouse button -- adjust selection to click point
<shift> Double click left mouse button -- adjust selection to include
word clicked on
<shift> Triple click left mouse button -- adjust selection to include
line clicked on
Single click right mouse button -- pop up shortcut to menu bar
<alt>-left arrow -- move selection left one space
<alt>-right arrow -- move selection right one space
BOOKMARKS:
<ctrl>-<shift>-1 -- set bookmark 1
<ctrl>-<shift>-2 -- set bookmark 2
<ctrl>-<shift>-3 -- set bookmark 3
<ctrl>-<shift>-4 -- set bookmark 4
<ctrl>-<shift>-5 -- set bookmark 5
<ctrl>-1 -- go to bookmark 1
<ctrl>-2 -- go to bookmark 2
<ctrl>-3 -- go to bookmark 3
<ctrl>-4 -- go to bookmark 4
<ctrl>-5 -- go to bookmark 5
MENUS:
<alt>-f -- file menu
<alt>-e -- edit menu
<alt>-r -- search menu
<alt>-b -- bookmark menu
<alt>-s -- selection menu
<alt>-x -- fixup menu
<alt>-p -- preferences menu
<alt>-h -- help menu
Known bugs and odd behavior:
While doing search and replace, be careful when doing bulk editing on
words that may be part of another word with an apostrophe extension,
e.g. won, won't, Mike, Mike's etc... Due to the way the script
recognizes words, it is very difficult to ignore single quotes yet
account for apostrophes, especially since they are being represented by
the same character. Actually, same problem with accented characters.
It is all due to the semantics of what perl considers to be a word
character or not.
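The apostrophe problem is easy to reproduce in any regex engine where an apostrophe is not a word character. The Python sketch below shows it, along with one possible guard (refusing a match that is followed by an apostrophe). The function names are invented and the guard is only a suggestion, not something guiguts does for you.

```python
import re


def replace_word_naive(text, word, new):
    """Whole-word replace using plain word boundaries."""
    return re.sub(r"\b" + re.escape(word) + r"\b", new, text)


def replace_word_careful(text, word, new):
    """Same, but skip matches that run straight into an apostrophe."""
    return re.sub(r"\b" + re.escape(word) + r"\b(?!')", new, text)


text = "I won the bet, but he won't pay Mike for Mike's help."
print(replace_word_naive(text, "won", "WON"))
print(replace_word_careful(text, "won", "WON"))
```

The naive version also changes the "won" inside "won't", because the apostrophe counts as a word boundary.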
Unicode support is
non-existent. The Tk text widgets I am using just
don't understand Unicode, so there isn't any way for me to implement it
with the current widget set. There are rumors that the next major
release of perl/Tk will support Unicode, but until it emerges, I won't
know for sure.
Unicode is now substantially supported if you are running Tk804.025 or
higher. There is still a problem of characters not being fully
supported by various fonts, but that is beyond my control. Many fonts
support substantial subsets of Unicode, and there are a few that
support quite a bit, but you can't rely heavily on any particular
character necessarily being available.
Accented character handling is broken. Again, a limitation of the
perl/Tk text widget. It knows about the existence of accented
characters
at least, but doesn't know enough about them to treat them as word
characters. Leads to some odd failures when doing regex
operations. I have tried to work around the problem as much as
possible, but you will still find some oddities here and there.
This is no longer as true as it was once. Accented character
support is still subtly broken, but I have managed to work around it
pretty smoothly. There are still a few minor caveats, but it is not as
serious as it was before.
The Regex search engine doesn't recognize the newline assertion \n.
This is perhaps the biggest and most distressing drawback to the
perl/Tk text widget. It has two major and serious implications for
doing regex operations.
1) You can't use newline assertions in regex search and replace
operations. They won't work. The perl/Tk text widget just doesn't
understand them.
2) You can't search for strings that cross over a line boundary.
Perl/Tk basically treats a text as an array of separate text strings. A
string that crosses a line boundary will not return as a hit when
searching for that string. This is a fairly serious drawback, but can
be worked around to some extent if you are aware of it.
I have neatly sidestepped the issue by bypassing the perl/Tk text
widget for any regex search that contains a \n assertion. It works
pretty well, though it is not completely seamless. It is probably close
to as good as it is going to get, however.
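The difference between searching line by line and searching the whole text as one string can be shown with a short Python sketch. It only illustrates the idea; the function names are invented and this is not how the perl/Tk widget or guiguts is implemented.

```python
import re


def search_by_line(pattern, text):
    """Mimic a line-at-a-time search: every line is matched separately."""
    return [m.group(0)
            for line in text.split("\n")
            for m in pattern.finditer(line)]


def search_whole_text(pattern, text):
    """Search the whole buffer as a single string."""
    return [m.group(0) for m in pattern.finditer(text)]


text = "the quick brown\nfox jumps over\nthe lazy dog"
pattern = re.compile(r"brown\nfox")

print(search_by_line(pattern, text))      # nothing found
print(search_whole_text(pattern, text))   # finds the match across the line end
```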
When saving a file
with Unicode characters, the console window will
complain "Wide character in print" for EVERY line that contains a wide
(2 byte) character. This is harmless, it saves correctly, it just
complains about it. I haven't figured out how to suppress this yet.
Never mind. I figured out how to suppress the warnings.
While doing regex search and replace with variable capturing, you can't
use the zero width positive look ahead and look behind in the search
term. It is a side effect of the way I have to do the regex handling to
work with the Tk text widget. You can use them while searching, but the
replacement term will not see any of the captured variables. A work
around is to just capture the forward or reverse term in its own set of
parentheses and just add it back in to the replacement term. Negative
lookaround assertions should be ok to use though.
This is still not perfect, though much improved. In general, you can
now use positive lookarounds, EXCEPT if they contain a literal closing
parenthesis. In that case, just capture the parenthesis and add it back
in.
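Here is what the "capture it and add it back in" work around looks like, sketched in Python. Python's own regex engine has no trouble with the lookahead, so the example only demonstrates that the two patterns give the same result. The sample text and patterns are made up.

```python
import re

text = "foo(bar) foo[bar]"

# Positive lookahead: match "foo" only when "(" follows, without consuming it.
with_lookahead = re.sub(r"(fo+)(?=\()", r"<\1>", text)

# Work around: capture the "(" as group 2 and put it back in the replacement.
with_capture = re.sub(r"(fo+)(\()", r"<\1>\2", text)

print(with_lookahead)
print(with_capture)
```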
Hey, it doesn't work!
When you run winguts, if you only get a DOS box that flashes up on the
screen, shows some text and then disappears, there are three possible
things that could cause that.
1) A bad directory name/path - Guiguts, though a graphical program, is
written in perl, which is built on a command line foundation. (DOS for
you Windows users.) As such, it carries some baggage associated with
that. Since it is built over DOS, you need to follow DOS naming
conventions for the directory it resides in, i.e. no directory names in
the path with more than eight characters (there's actually some wiggle
room there but that's essentially it) and no directory names in the
path with spaces in the name. Something like C:\dp\ is ideal. Something
like C:\Program Files\ is not going to work. The script will refuse to
run if you install it in a directory with either of those properties.
(Not as true under WinXP.)
2) Executable no longer in right directory
- The executable relies on being able to find several external modules
and libraries in a predetermined relative location. If the executable
is moved from the winguts directory, it will not be able to find the
external files it needs and will not run. If you want to run it from
your Desktop, Menu, Quick Launch bar, whatever, make a shortcut to it
and move that, but leave the executable where it is.
3) Corrupted or missing libraries/modules
- There are a whole bunch of external modules and .dll files that the
executable needs to run. If they are missing or corrupted, it will
refuse to run. Make sure you've downloaded and installed the perl
runtime libraries, and that the prl directory is in your path.
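The directory rules from item 1 (no spaces, no names longer than eight characters) are easy to check before you install. The Python sketch below is a hypothetical helper, not something shipped with guiguts, and it ignores the "wiggle room" mentioned above.

```python
from pathlib import PureWindowsPath


def path_problems(path):
    """List the directory names in a Windows path that break the rules."""
    p = PureWindowsPath(path)
    # Drop the drive/root part (e.g. "C:\") if there is one.
    names = p.parts[1:] if p.anchor else p.parts
    problems = []
    for name in names:
        if " " in name:
            problems.append((name, "contains a space"))
        if len(name) > 8:
            problems.append((name, "longer than eight characters"))
    return problems


print(path_problems("C:\\dp"))
print(path_problems("C:\\Program Files\\guiguts"))
```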
How do I....
Open a file?
- Select Open from the File menu at the top of the program
window. Or if you have previously opened a file, you can simply click
on the name in the recently opened files in the same menu. If you are
running winguts, you can associate the extension
.ggp (guiguts project) with winguts and if your files are named with a
.ggp extension, you can just double click on the file to open in
winguts.
Save a file?
- Select File -> Save from the menus at the top of the program
window. Alternately, Ctrl-s will save the file. Some functions will
automatically save the file when they are run if there have been edits.
(gutcheck, word frequency) Saving the file clears the undo buffer. Use
Save As if you want to save with a different name. If saving with a
different name, it is generally recommended to not JUST change the
extension. There are several files of external information (page
markers, bookmarks, function history) that are saved with the same file
name but different extensions. If you have two (or more) files in the
same directory with the same base name but different extensions, it
will
cause collisions in the external info files.
Append a file? - Place the cursor where you want the file to be added,
then select File -> Insert from the menus at the top.
Abandon changes? - Use File -> Clear to clear the current file from
memory, it will ask if you want to save any edits that have been made.
Quit? - Use either File -> Exit or click on the program close button in
the program frame. Will ask if you want to save any edits.
See the page images? - To see the page images, you will need to have
an image viewer, the page images in an accessible directory (default
is pngs directory under the directory the text file is in), and have
the page markers set. If the file is from Distributed Proofreaders, it
will have page separator markers in it. You can set the page markers
automatically while running Fixup page separators or set them instantly
by using File -> Set Page Markers. If your file has no page separators
in it, you can use Guess page markers to set some markers which will be
close to the correct page (you won't have to search very far at
least). Once markers are set, the current page number and a button
marked SEE will appear in the bottom status bar. Clicking on SEE will
open your image viewer to the image corresponding to the current page.
Set page image markers? - Page markers will automatically be set as
you run the Fixup Page Separators function. If you want to set them
immediately, before the Page Separator function is run (recommended),
use File -> Set Page Markers.
Set a bookmark? - Either use the Bookmarks menu item or Ctrl-Shift-(1
- 5) for up to five bookmarks per file.
Go to a bookmark? - Either use the Bookmarks menu item or Ctrl-(1 - 5)
Run gutcheck? - Under the fixup menu you can select the gutcheck
run options and run gutcheck.
Do bulk case adjustment? - Under the Selection menu you can do bulk
adjustment of case. Switch the selected text to all lower case, all
upper case, sentence case (first word capitalized), or title case (each
word initial caps).
Do bulk indenting? - Under the Selection menu you can move the
selected text right or left one space with one click. When moving text
left, it will not remove non whitespace characters. An easy way to
remove relative indenting is to continually move the text left until it
is all at the left margin.
Rewrap the text? - Under the Selection menu there are two rewrap
functions, Rewrap text and Block Rewrap text. Rewrap (by default) will
rewrap the text from column 1 to column 72. It will remove any double
spaces or trailing spaces in the text. Any text within /* */
markup will be ignored by the rewrap function. The /* markup should be
on a line by itself and there must be a blank line before the markup.
The */ markup should be on a line by itself and there must be a blank
line after the markup. */* markup will be treated as /* since, at
worst, it won't rewrap something that should be, rather than rewrapping
something that shouldn't be. If you want to have a relative amount of
indenting without rewrapping, you can use /* markup with an indent
modifier. Markup with an indent modifier will adjust the indent in the
block so that the left-most line will be set to have the indent
specified and all other lines will be adjusted to keep their same
relative indent. The modifier is an absolute indent. Negative numbers
will be ignored.
For example:
/*[4]
  text text text
    text text text
      text text text
  text text text
*/
Would become:
/*[4]
    text text text
      text text text
        text text text
    text text text
*/
4 spaces before the left most line, relative indenting maintained.
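The indent modifier is just a uniform shift of the whole block. Below is a minimal Python sketch of that arithmetic, with made-up indentation in the sample. The function name is an assumption and this is not the rewrap code from guiguts.

```python
def apply_indent(block_lines, indent):
    """Shift a block so its left-most line starts at `indent` spaces."""
    texts = [ln for ln in block_lines if ln.strip()]
    if indent < 0 or not texts:
        return list(block_lines)           # negative numbers are ignored
    current = min(len(ln) - len(ln.lstrip(" ")) for ln in texts)
    shift = indent - current               # same shift for every line
    shifted = []
    for ln in block_lines:
        if not ln.strip():
            shifted.append("")
        elif shift >= 0:
            shifted.append(" " * shift + ln)
        else:
            shifted.append(ln[-shift:])    # remove leading spaces only
    return shifted


block = ["  one", "    two", "      three", "  four"]
for line in apply_indent(block, 4):
    print(line)
```

Because every line moves by the same amount, the relative indenting inside the block is kept.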
You can call the rewrap function quickly with Alt-s-r (Press Alt s then
r without letting go of the Alt key)
Text inside /# #/ markup will be rewrapped using the Block Rewrap
margins. Block rewrap (by default) will rewrap text from column 5 to
column 72. The same rules apply to block rewrap markup as non rewrap
markup. The markup should be on a line by itself and there must be a
blank line before the open and after the closing markup. There are ways
to override the block markup defaults. If you put margin numbers on the
opening line, it will use those numbers for the margins instead of the
defaults.
They must be formatted thus: ( /#[x.y,z] ) The first
number is the general left margin override. ( /#[x] ) It will indent
all of the lines x spaces. If there is a period and a second number,
( /#[x.y] ), the first line will be indented y spaces and the
rest x. If there is a comma followed by a number, ( /#[,z] ), it will
override the default right margin setting. You can override the margins
in nearly any combination. If you override the first line (y) you will
need to have an x value, otherwise the y will be used for all of the
lines, and if you have both a left margin and right margin setting, the
left margin needs to come before the right. /#[,z.yx] won't
work, at least not like you'd expect.
For example:
/#
Text text text
text text text
#/
will be indented and rewrapped using the standard block rewrap margins.
/#[6,53]
Text text text
text text text
#/
will block rewrap with a left margin of 6 and right margin of 53
instead.
/#[2]
Text text text
text text text
#/
will use a left margin of 2 and a standard block wrap right margin.
/#[4.6,70]
Text text text
text text text
#/
will have a first line margin of 6, the rest of the lines at 4, and
wrap after column 70.
And so on.
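To make the /#[x.y,z] rules concrete, here is a small Python sketch of how such an opening line could be parsed. It is not taken from guiguts. The function name parse_block_margins and the returned dictionary layout are invented for the example, and the defaults of 5 and 72 are the block rewrap defaults quoted above.

```python
import re

# Optional [x.y,z] modifier after /#; every part is optional, but the
# left margin part must come before the right margin part.
MARKUP = re.compile(r"^/#(?:\[(\d+)?(?:\.(\d+))?(?:,(\d+))?\])?$")


def parse_block_margins(line, default_left=5, default_right=72):
    """Return the margins for a /# opening line, or None if it is not one."""
    m = MARKUP.match(line.strip())
    if m is None:
        return None
    x, y, z = m.groups()
    if x is None and y is not None:
        # As described above: without an x value, y is used for all lines.
        x = y
    left = int(x) if x is not None else default_left
    first = int(y) if y is not None else left
    right = int(z) if z is not None else default_right
    return {"left": left, "first": first, "right": right}


for opening in ("/#", "/#[6,53]", "/#[2]", "/#[4.6,70]"):
    print(opening, parse_block_margins(opening))
```

A modifier written in the wrong order, such as /#[,70.6], does not match the pattern at all, so this sketch returns None for it rather than guessing.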
You can call the block rewrap function quickly with Alt-s-b (press Alt,
then s, then b, without letting go of the Alt key).
The markers /p p/ have special meaning to the rewrap function and the
HTML autogenerate function. In rewrap, text inside /p p/ will be
treated the same as text with the markup /*[4] */. In other words, it
will get an absolute indent of 4 spaces on its left-most line and
maintain relative indents.
During HTML Autogenerate, text in the /p p/ markup will use special
poetry markup styles.
The markers /f f/ are special "Front material" markup. They are meant
to
be used to enclose the title, author, publishing data, etc. at the
front of a text. During rewrap, they will be treated the same as /$ $/;
no rewrap, no indent. During HTML autogenerate, they will
automatically ensure that the front material is centered.
Adjust
the rewrap margins? - Under the Prefs menu there is a selection
where you can change the default rewrap margins for both standard
and block rewrap. There are some common sense limitations on the
allowed margins; there must be a number selected for each, and the
right margin cannot come before the left margin.
Check for mismatched
(orphaned) brackets? - Under the Search menu there is a function
dedicated to finding mismatched brackets and rewrap markup. Select which
to search for and press Search. It will find all of the suspect markup
and let you cycle through and check each one (press Next).
Check for mismatched
(orphaned) HTML markup? - Under the Fixup menu, select
HTML fixup. Near the bottom of the window that pops up is a button
"Find orphaned markup".
Remove
trailing blanks? - There is a dedicated function under the Fixup
menu, or it can be run as part of the Fixup function under the Fixup
menu, or you could do a regex search and replace '\s+$' => ''
(search for one or more whitespace characters at the end of a line and
replace with nothing), and perform a replace all.
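For comparison, the same cleanup outside guiguts can be sketched in a few lines of Python. The function name is made up. The pattern here is [ \t]+$ rather than \s+$ because in Python's multi-line mode \s also matches the newline itself, which would swallow blank lines when the whole file is processed as one string.

```python
import re


def strip_trailing_blanks(text):
    """Remove spaces and tabs at the end of every line in `text`."""
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)


sample = "first line   \n\nsecond line\t \nthird line"
print(repr(strip_trailing_blanks(sample)))
```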
Find / fix spaced
hyphens/em dashes? - The Fixup function under the Fixup
menu has a sub function which will attempt to fix all spaced hyphens
and
em dashes it finds. Running gutcheck will also find any spaced hyphens
or em dashes.
Check for consistent
hyphenation? - Under the Fixup -> Word frequency function
there is a sub function that will sort
out all of the words with hyphens in them and display them with the
frequency they occurred. If there is a word that is identical to one of
the hyphenated words only without a hyphen, it will be displayed as
well with a string of asterisks next to it **** so it can be easily
picked out.
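The idea behind this check can be sketched in Python. This is an illustration only, not the guiguts routine: the function name hyphen_report, the word pattern, and the choice to fold everything to lower case before counting are all assumptions.

```python
import re
from collections import Counter

# A word is a run of letters, optionally joined by single hyphens.
WORD = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def hyphen_report(text):
    """List hyphenated words with counts, flagging unhyphenated twins."""
    counts = Counter(w.lower() for w in WORD.findall(text))
    report = []
    for word in sorted(w for w in counts if "-" in w):
        report.append((word, counts[word], ""))
        joined = word.replace("-", "")
        if joined in counts:
            # Same word without the hyphen: mark it so it stands out.
            report.append((joined, counts[joined], "****"))
    return report


text = "The to-day paper said today that to-day is fine. A well-known man."
for word, count, flag in hyphen_report(text):
    print(f"{count:4d}  {word} {flag}")
```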
Check
for consistent accents? - Under the Fixup -> Word frequency
function there is a sub function that will sort out
and display all of the words in the text that contain an accented
letter. If there are any words that are identical except they have an
unaccented letter, they will also be displayed with a string of
asterisks next to them **** so they can easily be picked out.
Check for unusual or
discouraged characters? - Under the Fixup -> Word frequency
function there is a sub function that will make a list of the different
characters that appear in
the text and how many times they appear. White space characters are
represented by their names rather than the actual character. This makes
it easy to check for mismatched brackets, upper ASCII, tabs, etc.
Check for unusual
capitalization? - Under the Fixup -> Word frequency function
there are the sub functions Check All Caps and Check Mixed Case.
Check All Caps will sort out
and display all of the words that have no lower case letters. (10TH
would be displayed, even though it contains digits, since it contains
no lower case letters.)
Check MiXeD CasE will display
all words with a mixture of lower case letters and at least one upper
case letter not at the beginning of the word.
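The two rules are simple enough to sketch in Python. The function names are invented, and one detail is an assumption: the all-caps test below also requires at least one upper case letter, so that a bare number such as 1905 is not listed.

```python
import re


def all_caps(words):
    """Words with no lower case letters (and at least one upper case one)."""
    return [w for w in words
            if any(c.isupper() for c in w) and not any(c.islower() for c in w)]


def mixed_case(words):
    """Words with a lower case letter and an upper case letter after position 0."""
    return [w for w in words
            if any(c.islower() for c in w) and any(c.isupper() for c in w[1:])]


words = re.findall(r"\w+", "The 10TH of MiXeD CasE and McDonald in 1905 USA")
print(all_caps(words))
print(mixed_case(words))
```

Note that a legitimate name like McDonald is reported by the mixed case check too, so the list is for review, not automatic correction.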
Check
spelling? - Either under the Search menu or in the word
frequency routine. Search -> Spell Check will spell check like a
traditional spell checker, i.e., it will scan the file for unrecognized
words, highlight them in the text and suggest several possible
replacements. It will learn from mistakes so that if you have a word
misspelled the same way several times and you use a particular
replacement, it will be moved higher in the list of possible
replacements for subsequent occurrences.
Under the Word Frequency
routine, it will scan the file, then return a list of unrecognized
words
that you can quickly look through to get an idea of how spellcheck
intensive the file will be. It also makes it relatively easy to
quickly pick out obviously misspelled words rather than just
unrecognized ones.
Check
for scannos? - Either under the Search menu or in the word
frequency routine. Search -> Stealth Scannos will pop up a modified
search window that will allow you to load predefined pairs of words that
are commonly misscanned for one another and easily cycle through them.
The word
pairs files are in the scannos directory under the guiguts directory.
Under
the Word Frequency routine, it will scan through the file and
pick out the words that appear in the scannos word pair files so you
can quickly look through the list. You can also load a file called
misspelled.rc, which contains the top 3500 or so most common misscanned
letter combinations.
Right justify poetry
line numbers? - Under the fixup menu. This function will move
any numbers that are the last characters in a line and separated from
any other text by at least two spaces over against the right margin
(as specified by the rewrap right margin.)
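A rough Python sketch of the rule described above, for one line at a time. The function name and the default margin of 72 (the standard rewrap right margin) are assumptions, as is the choice to always keep at least two spaces when the line is already too long.

```python
import re

# Text, then at least two spaces, then a number as the last characters.
LINE_NUMBER = re.compile(r"^(.*\S) {2,}(\d+)$")


def right_justify_line_number(line, right_margin=72):
    """Push a trailing poetry line number out to the right margin."""
    m = LINE_NUMBER.match(line)
    if m is None:
        return line  # no qualifying line number, leave the line alone
    text, number = m.groups()
    padding = right_margin - len(text) - len(number)
    return text + " " * max(padding, 2) + number


print(right_justify_line_number("Of arms and the man I sing  5", 40))
```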
Easily find mismatched
quotes? - A common error in gutcheck is mismatched quotes. To
quickly find them,
click on the gutcheck warning 3 times. This will move the cursor to the
end of the paragraph with the mismatched quotes and change the focus to
the text window. Press "Control-Shift-Up Arrow" to select the paragraph
before the cursor, press "Control-Shift-Double quote" to highlight all
of the double quotes in the paragraph. Makes it much easier to pick out
the missing/extra quote marks. (Also works for "Control-Single Quote"
for single quote mismatches.) Press "Control-Zero" to remove all
highlights.
Use the ASCII Box drawing
tool? - This will draw ASCII art boxes around selected text. The
selection MUST start and end on a blank line. You can change which
characters are used for drawing by changing them in the ASCII Boxes pop
up window. You can adjust the size of the boxes (default 64 wide). If
you elect to rewrap the text as you draw, it will rewrap the text to
fit inside the box with a minimum of one space between the text and the
frame (default rewrap 60). You can choose to left justify, center or
right justify the text within the box. (If you do use ASCII boxes
for a Gutenberg text, it is recommended that they be indented at least
two spaces to prevent rewrapping during whitewashing.) See an actual
example here.
Keep track of what I
have done with a file? - Under the Help menu there is a Function
history that keeps track of what major functions have been performed on
a file with a time stamp.
Enter
accented characters? - Under the Help menu there is a Latin-1
function that will pop up a little window with all (well, most)
of the 8 bit Latin-1 characters. Click on a character to
insert it at the cursor.
Transliterate
Greek passages? - Under the Help menu there is a Greek
transliteration function that will pop up a
little window with all common Greek character glyphs.
You can select to get Latin Characters transliteration, Greek character
names or HTML codes as output. Click on a glyph to insert the
resulting code at the cursor.
There is a newer, more comprehensive transliteration scheme based on
beta encoding available for those who want to preserve more information
about accented characters. For unaccented characters, the
transliteration is the
same as the Perseus method (what we use on the site and what guiguts has
used up to now). Beta encoding provides a method to preserve the
accents. There are basically eight accents that you need to deal with
for Greek, they are detailed below: (You will need a Unicode aware font
to view the examples in the chart.)
Popular name                 Greek name       Symbol  Example  Encoded
rough breathing mark         dasia            (       ἁ        a(
soft breathing mark          psili            )       ἀ        a)
acute                        oxia             /       ά        a/
grave                        varia            \       ὰ        a\
iota subscript               prosgegrammeni   |       ᾳ        a|
tilde (or inverted breve,    perispomeni      ~       ᾶ        a~
  depending on the font)
diaeresis (rare)             dialytika        +       ϋ        y+
breve (rare)                 vrachy           =       ᾰ        a=
macron (very rare)           macron           _       ᾱ        a_
To encode a character in beta code, transliterate the base character
as normal. Then, starting from the highest point, working from left to
right, place the symbols for the various accent marks after the base
character. Stack as many accent symbols as needed to make the
character. For example, ᾭ would be Ô(/|.
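One way to experiment with this encoding is to let Unicode do the splitting. The Python sketch below decomposes a precomposed Greek character (NFD normalization) into its base letter plus combining marks, then maps each mark to the symbol from the chart. It is not how guiguts does it. The base letter map is a deliberately tiny, hypothetical stand-in covering only the letters used in the examples, and relying on the NFD mark order to match the "highest point, left to right" rule is an assumption that happens to hold for these examples.

```python
import unicodedata

# Combining marks mapped to the symbols from the chart.
ACCENT_SYMBOLS = {
    "\u0314": "(",   # rough breathing
    "\u0313": ")",   # soft breathing
    "\u0301": "/",   # acute
    "\u0300": "\\",  # grave
    "\u0345": "|",   # iota subscript
    "\u0342": "~",   # tilde / perispomeni
    "\u0308": "+",   # diaeresis
    "\u0306": "=",   # breve
    "\u0304": "_",   # macron
}

# Hypothetical, incomplete base letter map: alpha, upsilon, capital omega.
BASE_LETTERS = {"\u03b1": "a", "\u03c5": "y", "\u03a9": "\u00d4"}


def beta_encode(char):
    """Encode one accented Greek character as base letter plus symbols."""
    decomposed = unicodedata.normalize("NFD", char)
    base, marks = decomposed[0], decomposed[1:]
    return BASE_LETTERS[base] + "".join(ACCENT_SYMBOLS[m] for m in marks)


for ch in ("\u1f01", "\u1fb6", "\u03cb", "\u1fad"):
    print(ch, beta_encode(ch))
```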
There is a utility box at the bottom of the Greek
transliteration window to help assemble accented Greek characters.
Type in the base character, select the accents you want
from the list, and press
Enter to place the character in the transliteration window.
Use
tear off menus? - When you click on one of the menu items at the
top of the program window, there is a dotted line separator up near the
top. Click on the dotted line to "tear off" the menu and leave it open
on your desktop. Especially useful when doing bulk indenting under the
Selection menu.
Make the displayed text
bigger/smaller/a different font? - Under the Prefs menu, select
Font. This will allow you to modify many of the display properties to
suit your preferences. It will not have any effect on the text files;
it only affects the display properties.
Change
Aspell dictionaries? - Start spell check from within guiguts.
Click on the options page. The list box in the center of the window
that pops up lists all available dictionaries. The currently loaded
dictionary is listed at the bottom. Double click on a dictionary to
switch to that dictionary. Press OK. Close and restart Spell check to
spell check the current document using the new dictionary.
Set up External program
calling parameters? - Call any program using the same parameters
that would be used in the
Windows Start->Run box or at a command prompt. For Windows, if you
have a registered extension, you can start the associated program
automatically by using 'start [filename]'. Some programs may require
rundll [filename]. For instance, to open a web page using the default
browser, enter 'start http://www.pgdp.net' (without the quotes). If you
are calling a program that has a space in the path name, you must
enclose the program name in double quotes, e.g.,
"C:\Program Files\Accessories\wordpad.exe". I have included a few
examples. Click on setup at the bottom of the External menu to see/edit
them. You can also edit the setting.rc file directly if you prefer.
Make a backup copy first though, if you choose to go that route. Changes
made to the external calling parameters will not be visible in the
menus until guiguts is closed and restarted.
There are a few internal variables exposed for use in calling external
modules, if
desired. The exposed variables are:
$d - the directory path to the currently open file
$f - the name of the currently open file (without extension.)
$e - the extension of the currently open file.
In other words, the full canonical name of the open file is $d$f$e.
$i - the (i)mage directory with full path
$p -
the file number corresponding to the (p)age where the cursor is in the
currently open file.
For example, you can pass the name of the png file of the current page
to a program using the command "C:\some\path\program.exe $i$p.png".
Or, under Windows, pass the current file to your default handler with
"start $d$f$e" (useful to view HTML files). Note: if you try to use any
of these variables when they are not set, you will get errors. E.g.,
trying to use $f before you have opened a file will not be successful.
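To show how the pieces fit together, here is a Python sketch of the substitution. It is not the guiguts code. The function name expand_command and its parameters are hypothetical, Windows-style paths are assumed, and string.Template is used only because its $name syntax happens to match. A variable that has not been set raises KeyError, which mirrors the "you will get errors" behavior described above.

```python
import ntpath
from string import Template


def expand_command(template, open_file=None, image_dir=None, page=None):
    """Fill in $d, $f, $e, $i and $p in an external command template."""
    values = {}
    if open_file is not None:
        directory, filename = ntpath.split(open_file)
        stem, ext = ntpath.splitext(filename)
        if not directory.endswith("\\"):
            directory += "\\"      # so that $d$f$e is the full file name
        values.update(d=directory, f=stem, e=ext)
    if image_dir is not None:
        values["i"] = image_dir    # expected to end with a path separator
    if page is not None:
        values["p"] = page
    return Template(template).substitute(values)


print(expand_command("start $d$f$e", open_file="C:\\books\\mytext.html"))
print(expand_command("C:\\some\\path\\program.exe $i$p.png",
                     image_dir="C:\\books\\pngs\\", page="012"))
```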
Change log history:
Version
.612(487k) Fixed problem where scannospath variable would sometimes
get corrupted in setting.rc file. Order-of-operations error. You would
think that I would have checked that; look at the first line in the
.611 changes note. Apparently not....
Fixed problem with /F .. F/ markup not being correctly handled in HTML
autogenerate.
Fixed problem where Remove Blank Lines Before Page Separators
would go into an endless loop at the first separator.
Version
.611(487k) Fixed problem where pngspath variable would sometimes
get corrupted in bin file. Order-of-operations error.
Tweaked autosave indicator a bit. The file may not autosave after
the first autosave interval has expired depending on several factors
even if changes have been made. It will autosave after each subsequent
autosave interval (assuming changes have been made in the meanwhile.)
Manually save the file once to sync up the autosave function if it
bothers you.
Fixed problem with Font selection dialog not showing up correctly.
Tracked down and fixed problem with Footnote moving code.
Hopefully I've now found and fixed all the things I messed up while
refactoring. I REALLY need to write a test suite to automatically
exercise the script after I've edited it.
Added some code to check if there is a caption for an illustration when
inserting html illustration markup and avoid inserting bogus <p>
mark if there isn't.
Version
.61(487k) Fixed minor problem in Word frequency sort routines
where words that contained an upper case Æ ligature were not
ending up in the expected positions.
Twiddled around with the
middle button auto-scroll function. Added ability to auto-scroll in x
axis as well as y.
Fixed problem with settings save function
where it wasn't properly quoting the $jeebiesmode value.
Fixed problem with the gutcheck view option “Carat
character” not responding to the view selection.
Added a bunch of stuff to improve auto-save. The auto-save timer
will now be reset every time a file is saved or loaded, whether
manually, as
the result of some other operation, as the result of an auto-save,
or, when you right click on the Save icon in the toolbar (the little
floppy disk). The Save icon now changes its background color to
green if auto-save is enabled. When the auto-save timer is down to
ten seconds from performing an auto-save, the background of the Save
icon will start flashing yellow.
Right click on the flashing icon to reset the timer if you want to skip
a save. Shift-right-click on the Save icon to toggle auto-save on and
off.
When you save a file now, the current position of the insert
cursor is saved as well in the bin file (as $bookmarks[0]). When the
file is reopened, the cursor and view will automatically return to
the saved position, or the top of the file if no position was saved.
Modified the HTML auto-generate routine to optionally (checkbox)
not convert non iso-8859-1 characters to numeric entities, i.e., leave
them as UTF-8. It will also make an attempt to modify the character
encoding in the HTML header to UTF-8. This may fail if you have
customized your header.txt file, so you may need to check that the
charset encoding is set correctly after generation. Note: It is
probably better to leave
most English language texts as iso-8859-1 encoding, even if they
contain a few characters outside of it. The non iso-8859-1 characters
will be encoded as
numeric entities and will work fine (assuming you have a browser/font
which can display those characters, which is a totally separate issue).
This is really only
intended for texts that are all, or mostly non iso-8859-1
characters. (Mostly for DPEU, in other words.)
Modified Selection pop-up to update the selection parameters every
time you modify the selection. Probably more useful that way.
Modified block selection code to select to the end of the line of all
internal lines of the selection if the last line of the selection is
selected to the end. Before, you could not select any further to the
right than the end of the last line of the selection so it was
difficult to select a block on the right side of ragged edge text
unless you artificially padded the end line with spaces. It is possible
that this behavior will be undesired in some instances, if so however,
you can get back the old behavior by putting one extra space on the end
of the last line and selecting up to just before that space.
Rewrote the code for the HTML pop-up window. I shuffled a few of the
buttons around to allow me to factor out some common code. All of the
same functionality with about 150 fewer lines of code. Much
easier to maintain.
Rewrote Table Fx pop-up window. Factored out common code. Removed
about 50 lines. Changes should be completely invisible to end user.
Rewrote settings save routine to be less error prone when making
modifications. The internal layout of the setting.rc file has changed
but it is backward and forward compatible.
Went through entire script, editing to follow better coding
practices and factoring out common code. Shouldn't affect end user
much, if at all, but makes maintenance easier. Touched probably
well over 1000 lines of code. Tried to exercise all the changes to make
sure I didn't break anything. Likely that I missed something somewhere
though.
Version
.601(487k) Recoded the various sort routines for the Word
Frequency functions using Schwartzian Transforms to cut down on the
processing time. Significantly decreased time to sort the lists for
large data sets. In the process, I fixed the error that version .60 had
if you tried to sort character counts by length. (Which wasn't much use
anyway...)
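For readers who have not met the term: a Schwartzian Transform computes each item's sort key once, sorts on the stored keys, then throws the keys away (decorate, sort, undecorate). A minimal Python sketch of the pattern follows, using a made-up "length, then alphabetical" ordering rather than any actual guiguts sort routine.

```python
def sort_by_length_then_alpha(words):
    """Decorate-sort-undecorate: compute each key once, then sort on it."""
    # Decorate: pair every word with its (possibly expensive) sort key.
    decorated = [((len(w), w.lower()), w) for w in words]
    # Sort on the precomputed key only.
    decorated.sort(key=lambda pair: pair[0])
    # Undecorate: drop the keys again.
    return [w for _, w in decorated]


print(sort_by_length_then_alpha(["pear", "fig", "Apple", "kiwi"]))
```

Python's sort does this internally when given a key function, so the explicit version here is only to show the three steps.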
Added some code to see if your local perl installation has the
Text::LevenshteinXS module available, and uses it if it is, to
calculate
Word Frequency harmonics. I in-lined some code from the pure perl
Text::Levenshtein module but the compiled XS module is much
faster. Speeds up the harmonics functions by several orders of
magnitude. It is recommended that you install Text::LevenshteinXS if at
all possible.
Fixed bug in gutcheck display code where multiples of the same query on
a single line were causing index problems. Worked around problem by
only querying the first instance on a line. (They almost always stem
from word queries on markup anyway.)
Modified HTML link checker to pick up images embedded in CSS styles.
Version
.60(486k) In honor of the version number (.60), guiguts
will now work with .6x versions of Aspell. Many thanks to bgalbrect for
puzzling out the difference between the versions' command lines and
submitting patches to get it working. Still backward compatible with
.5x versions too. As of now, there still isn't a generally available
.6x version compiled for Windows (that I am aware of), so Windows users
are kind of stuck with .5x for the time being.
Modified rewrapping routine to ignore <sc> </sc> markup
while rewrapping. Actually, it will ignore all markup enclosed in
<> brackets (As long as there are no spaces in it.) except
<i></i>, for which it will allow one space. (for when it
gets converted to _ _.)
Tweaked the search and Replace histories to store non-Latin-1
characters correctly. Previous changes I had made prevented the
histories from corrupting the setting.rc file but there were still
issues with Unicode > ordinal 255. Hopefully this will resolve them
completely.
Modified sort orders in various windows, (Word Frequency, harmonics,
link check, etc.) to use a "natural sort" where numbers are sorted by
magnitude and words are sorted alphabetically. They used to sort
numbers alphabetically, (well, "ascii-betically") so that, for
instance, numbers would be sorted like: 10, 2, 300, 4, 45. Now those
numbers will be sorted 2, 4, 10, 45, 300.
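A natural sort of this kind can be sketched in Python by splitting each string into digit and non-digit runs and comparing the digit runs as numbers. The key function below is an illustration with an invented name, not the guiguts implementation.

```python
import re


def natural_key(text):
    """Sort key: digit runs compare by magnitude, other runs alphabetically."""
    key = []
    for part in re.split(r"(\d+)", text):
        if not part:
            continue
        if part.isdecimal():
            key.append((0, int(part), ""))
        else:
            key.append((1, 0, part.lower()))
    return key


numbers = ["10", "2", "300", "4", "45"]
print(sorted(numbers))                   # plain string sort
print(sorted(numbers, key=natural_key))  # natural sort
```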
Added an option to sort by word length (secondary sorts
alphabetically). There was limited room to squeeze it in, so the radio
button labels are a little cryptic. "Alph" means sort alphabetically
(natural sort), "Frq" means sort by word frequency, and "Len" means
sort by word length. Changing the sort order will not automatically
re-sort the list, you'll need to select a sort order, then select a
function to re-sort it.
Modified the first and second harmonics functions to be much less
complex. Sped up the second harmonic function by several orders of
magnitude at the cost of marginally decreasing the speed of the first
harmonic. Both harmonics functions now take about the same amount of
time. Both will now handle Unicode characters much better.
Modified behaviors of various list boxes slightly. They no longer have
a separate indicator for the 'active' and the 'selected' items. (Minor
and probably unnoticeable change.) They also have had their right mouse
click actions changed to occur on button release rather than button
press. The right mouse button will act on the item under the mouse
pointer reliably, even if it is not the currently selected item.
Twiddled around with the Footnote Fixup a bit. Changed the Landing zone
code to be much less user unfriendly. Landing zone positions are now
located just before the footnotes are moved. They are still denoted by
the FOOTNOTES: notation, but there is no underlying significance. You
can now add a landing zone by just typing in FOOTNOTES: (on a line by
itself, with a blank line after.) And you can remove a Landing zone by
simply deleting the FOOTNOTES: marker. The file will be scanned for
landing zones when it is ready to move the footnotes. If you don't have
a valid landing zone for any/all of the footnotes, one will be
automatically inserted at the end of the file to receive the orphan
footnotes.
HTML autoconvert will now convert <tb> to a horizontal rule.
(same as the asterisk thought break).
Version
.593(484k) The search and replace histories were still
causing problems. They didn't handle non-Latin-1 characters very well
and had problems with embedded meta and control characters. After
messing around with a fragile and somewhat bizarre scheme, I realized I
was an idiot and changed the save routine to simply encode all non word
characters as their hexadecimal ordinals. This very neatly side steps
the issue. It requires absolutely no programming changes to the load
routine, removes the necessity to check for wide (multi-byte)
characters on save, and is backward compatible. A win all around.
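The general idea of escaping every non-word character as its hexadecimal ordinal can be sketched in Python. The \x{...} notation used here is an assumption, since the exact format written to the settings file is not stated. The decode function exists only so the sketch can show a round trip; guiguts itself needed no change to its load routine.

```python
import re


def encode_term(term):
    """Replace every non-word character with \\x{hex ordinal}."""
    return re.sub(r"\W", lambda m: "\\x{%x}" % ord(m.group()), term)


def decode_term(encoded):
    """Reverse encode_term (for this sketch only)."""
    return re.sub(r"\\x\{([0-9a-f]+)\}",
                  lambda m: chr(int(m.group(1), 16)), encoded)


term = 'say "hi"\n\u2014 ok'
stored = encode_term(term)
print(stored)
print(decode_term(stored) == term)
```

Because quotes, backslashes, newlines and other control characters never appear literally in the stored form, they cannot break the file they are saved in.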
Did away with something that has been bothering me for quite a while.
The bin files now use the entire base file name as their base name with
.bin appended. Now if you have file.txt your bin file will be
named file.txt.bin not file.bin. This will alleviate the problem of
name space collisions between plain text and Unicode text or html
files. Now you can have file.txt, file.utf and file.html and the bin
files will be named file.txt.bin, file.utf.bin and file.html.bin. I
have no idea why I didn't just do it that way from the beginning. It
seems so much more sensible. Sigh. .593 will attempt to find
file.txt.bin first, then will check for file.bin for backward
compatibility. It will only save bin files with the new name format. If
you want to return to an earlier version of GG you will need to
manually edit the bin file name. (The internal structure hasn't
changed.)
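The lookup order described above (new style name first, old style name as a fallback, always save under the new name) can be sketched in Python. The function names are invented, and the `exists` parameter is there only so the logic can be tried without touching the disk.

```python
import os


def find_bin_file(text_path, exists=os.path.exists):
    """Return the bin file to load for `text_path`, or None if there is none."""
    new_style = text_path + ".bin"                       # file.txt.bin
    old_style = os.path.splitext(text_path)[0] + ".bin"  # file.bin
    for candidate in (new_style, old_style):
        if exists(candidate):
            return candidate
    return None


def bin_save_path(text_path):
    """Bin files are always saved with the new name format."""
    return text_path + ".bin"


on_disk = {"file.bin"}
print(find_bin_file("file.txt", exists=on_disk.__contains__))
print(bin_save_path("file.txt"))
```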
Guiguts will now attempt to make a back-up copy of your bin file every
time you save. Now, if your bin file gets corrupted, you should be able
to recover a lot easier. The file will be the bin file name with .bak
appended - file.txt.bin.bak.
Version
.592(483k) Fixed stupid error on my part that was causing
some saved terms in the search/replace history to corrupt the
setting.rc file.
Version
.591(483k) Added some logic to the status bar update
routine to avoid unacceptable slowdowns in processing time while doing
a gutcheck or tidy check on files that don't have any page numbers set.
Most (all?) DP texts will have page numbers derived from the page
separators, so it shouldn't have been an issue with DP texts. Working
with a text that DIDN'T have the DP page separators would run slower
and slower, till it eventually would grind to a halt, using 100% of the
CPU. Should no longer be a problem. (A text with no page numbers will
still be slightly slower to
process, but it will be a constant delay of 25 ms or so per update
rather than variable depending on how far into the file the cursor is.)
Added some code to the search term entry box for regex searches. Does
continuous checking of the regex term while you are entering it. If it
is a legal term, the text will be black, if it is NOT a legal regex,
the text will turn red. Note: just because the term is LEGAL, it is not
necessarily CORRECT. Also note that there are some regex terms
that while technically correct AND legal, will cause exceptions in the
guiguts regex engine. (Like escaped alphabetic characters that are NOT
a regex assertion, like \h or \y.) By the way, this is nothing new, it
just becomes much more apparent with the continuous regex checking.
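The check itself amounts to "try to compile the term and see whether it fails". A Python sketch of that idea follows. It uses Python's regex engine, which accepts a somewhat different set of patterns than Perl's, and the function name and color strings are made up for the example.

```python
import re


def regex_status(term):
    """Return the text color for a search term: black if legal, red if not."""
    try:
        re.compile(term)
    except re.error:
        return "red"
    return "black"


for term in ("[f-j]+", "(unclosed", "a{2,3}"):
    print(term, regex_status(term))
```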
Added some logic to track the text box that last had focus, so that
characters entered from the Latin-1 or Unicode pop-up menus and Unicode
character entry will be inserted into the last field to have focus
instead of always the main text window. A fairly obvious enhancement,
but one that I was unsure of how to implement. As it turns out, it was
pretty easy due to some other refactoring I had done several versions
back. Hurray for OO methodologies.
Fixed typo in Regex quick reference where the description of the
character class [f-j] erroneously left out g and h.
Added search and replace history drop-down menus to the S & R
dialog. Press on the down arrow to the left of the entry box and the
previous terms will be available to select. Select one and it will be
automatically entered into the entry. By default, 20 terms will be
saved. Duplicates will be condensed. Adjust the size of the history
under the Prefs menu from 1 to 200 terms. (more than 50 or so is
probably not a great idea.) Search and Replace histories will be saved
from session to session. Clear the history by selecting Clear History
from the top line of the history drop-down. All of the replacement term
entries share a common history. I could have made them separate but it
seemed overkill to me.
Tweaked the page separator fixup; Join, Keep Hyphen function to
correctly join a leading emdash to the previous word. This is
very low occurrence, but relatively easy to fix. Note: plain Join Lines
will NOT close up the naked leading emdash.
Version
.59(474k) Modified Auto Save routine to not bother if
there hasn't been anything changed since the last save.
Modified Word Frequency - Accent Check to do some special case checks
for the ligature Æ. Previously it would flag as suspect words
that
had AE but not those with Ae.
Rewrote the Word Frequency harmonic search function to be MUCH more
efficient. Sped it up by an order of magnitude. (~10 x faster than it
was.)
Added a second harmonic function to the word frequency sort options.
Very much like the first harmonic only it will search for words within
a Levenshtein edit distance of two. (The word can be derived from the
root word with two or fewer edits--add, change or remove characters.)
Again, it only displays words that are present in the open document,
not all possible words. It takes about 1-2 seconds (P4 2.2 GHz) per
letter in the root word to do its search, so long words may take a
while. Be patient.
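For anyone curious what "within a Levenshtein edit distance of two" means in practice, here is a plain Python sketch. It uses the textbook dynamic programming algorithm and compares the root word against every word in a vocabulary, which is not how guiguts searches. The function names are invented.

```python
def levenshtein(a, b):
    """Classic edit distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def harmonics(root, vocabulary, max_distance=2):
    """Words from `vocabulary` within `max_distance` edits of `root`."""
    return sorted(w for w in vocabulary
                  if w != root and levenshtein(root, w) <= max_distance)


words = {"there", "their", "here", "three", "where", "thorough"}
print(harmonics("there", words, max_distance=1))
print(harmonics("there", words, max_distance=2))
```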
Rearranged buttons slightly to accommodate the 2nd harmonic button.
Changed wording slightly to fit into the space available. (Dropped the
word "Check" from several buttons, abbreviated "harmonic" to "harm".)
When doing harmonic checks by doing a Ctrl+left click or arrow up and
down, the harmonic function that is run will be either 1st harmonic if
you haven't run any harmonic functions before or the last harmonic
function that was run by pressing one of the harmonic buttons. In other
words, every implicit harmonic you do, will be of the order of the last
explicit harmonic. If you do a 2nd harmonic by pressing the 2nd Harm
button, then every implicit harmonic you do (Ctrl+left click or arrow
up and down) will be a 2nd harmonic until you do a 1st harmonic by
explicitly pressing the "1st Harm" button. This was actually a bug, or
at least not intended, but I kind of like the effect so I left it that
way. It removes the need to have different hot key combinations for the
different harmonics.
Added the word that the harmonic is being computed for to the harmonic
window header line.
On a trial basis, modified the page separator fixup functions to NOT
remove spaces at the beginning of the line after the separator.
When I wrote the routines originally, it was a problem. With the
separate formatting rounds, it is probably not as much of one. This
should help with not losing formatting around the separators for
pre-formatted blocks, (indexes, poetry, TOCs, etc.)
It has come to my attention that support for some of the Unicode lookup
functions I use in guiguts was disabled in perl 5.8.5, 5.8.6 and
5.8.7. I queried the perl maintainer who made those changes in
the perl source code and got a rather feeble answer that yes, he had
disabled it, no, he didn't have any handy substitute method to do what
I was trying to do, and yes, he would probably re-enable it for perl
5.8.9. Sigh. In the meanwhile, for people who are using those versions
of perl that lack Unicode Block support, I have written a script that
will automatically download all of the latest information from
www.unicode.org and rebuild all the Unicode scripts. The script,
named update_unicore.pl, is included with the distribution. It can be
run at any time to update your perl installation to the latest Unicode
information. (Right now, it is more up-to-date than that included with
the latest perl distribution.) The script will be automatically run if
you try to use functions that need the information and it is missing.
Users of the perl runtime libraries I distributed do not need to run
this script and indeed, should not.
Version
.583(475k) Tracked down a few issues that were causing
the page anchors to be moved to the end of the line under certain
circumstances while auto-generating HTML. Fixed most of them. There are
still a few extremely obscure circumstances that I know of that could
cause it, but they will be trickier to work around. It should be much
better anyway.
Fixed fairly serious bug in my file save/load routines that could
corrupt certain UTF-8 characters if they were the last character on a
line. Order-of-operations problem.
Added a bit more error detecting and reporting code for operations that
use temporary files. Should make problems easier to diagnose.
Version
.582(474k) Worked on the status bar Goto functions
to make them a bit more user friendly.
Added a "Goto Label" function to the Label readout to resemble the Goto
Page and Goto Line functions.
Fixed all three Goto functions to avoid the problem where they would
stop working when you tried to go to a non-existent destination.
Changed the Page Label configuration pop up to be activated by a right
mouse click instead of left so it could pop up the Goto dialog on left
mouse click (to make it more consistent with the other two.)
Added a label "Lbl:" to the Label status readout to be consistent with
the other two. ("Ln:" & "Img:")
Modified Page Label status bar to read out "None" rather than stay
blank if there were no label assigned to a page.
Changed the Insert/Overstrike status readout to just be I/O; saves room
on the status bar for more critical information.
Added tool tips to all of the status bar readouts that did not already
have one.
Added a "Normal" mode selection to the Jeebies interface pop up window.
I had Paranoid and Tolerant as the only possible choices.
Version
.581(474k) Fixed problem with image file opening
introduced when I modified page/image tracking functions to work with
alphanumerics.
Version
.58(474k) Modified the page and image tracking functions
to work with page/file names that contain alphabetic characters:
001a.png, 001b.png, etc. It is kind of a hack and may have subtle
issues with blank page handling, but it is pretty close. I only have
one real file to test it on and it seems to work ok for that and it
doesn't seem to have broken functionality for normal files. I'm sure if
something is broken, someone will find it. Note: It probably is not a
good idea to have files that have NO leading digits, though it is
theoretically possible now. There are a few very low use functions
which won't work with alphanumeric page numbers; the original page
renumber function, for instance. That is deprecated and very low use
though, so I didn't feel it necessary to rewrite it.
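As a rough illustration of the alphanumeric page name handling described above, here is a minimal sketch in Python (guiguts itself is Perl, and this is not its actual code). The function name page_sort_key and the choice to sort names without leading digits first are assumptions made for the example.

```python
import re

def page_sort_key(filename):
    """Split a page file name such as '001a.png' into (number, suffix)."""
    stem = filename.rsplit(".", 1)[0]        # drop the extension
    digits, suffix = re.match(r"(\d*)(.*)", stem).groups()
    # Names with no leading digits sort first here; that choice is arbitrary.
    number = int(digits) if digits else -1
    return (number, suffix)

pages = ["002.png", "001b.png", "010.png", "001a.png", "001.png"]
print(sorted(pages, key=page_sort_key))
```

Sorting on the numeric part first keeps 001a.png and 001b.png right after 001.png instead of relying on plain string order.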
Fixed problem with Auto end landing zone function in Footnote Fixup.
Under certain (common) circumstances, it would try to access an
undefined variable and get confused.
Messed around some more with the page anchor/number insertion code
trying to reduce bare span errors. Think I made it better. Not sure
without more testing over a wide array of projects.
Modified how guiguts tracks which platform it is running under (which
it checks often during various routines to determine which operations
need to/can not be run.) Instead of doing it locally at each place
where it needs the information, I am doing it once at the start of
the script and assigning the value to a constant which will then allow
all of the subsequent checks to be optimized away by the compiler,
leading to a smaller memory footprint and faster operation. (By very
small amounts in the grand scheme, to be sure.) The big win is much
improved maintainability.
Version
.573(473k) Sigh. Accidentally redefined a variable which
prevented the header file from being loaded. Deleted bogus line. Should be
ok now.
Version
.572(473k) A few minor tweaks and twiddles, only one of
any import. I have now made the script sensitive to what directory it
is in in relation to what directory it is called from. It should now be
able to find its support files even when started from a different
directory from where it resides. (No longer necessary to cd to the
directory before you run it.) This should have been an obvious change,
but I long ago set up my system to sidestep the issue and it just never
occurred to me that not everyone would, or could, or wanted to.
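The general technique of finding support files relative to the script, rather than the current directory, can be sketched as follows. This is a Python illustration only, not the Perl code used in guiguts; the function name support_file and the example paths are hypothetical.

```python
from pathlib import Path

def support_file(script_path, name):
    """Return the path of a support file that sits next to the script."""
    script_dir = Path(script_path).resolve().parent
    return script_dir / name

# In a real script, script_path would be __file__.
print(support_file("tools/guiguts.py", "header.txt"))
```

Because the path is resolved to an absolute location first, the result is the same no matter which directory the script is started from.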
Version
.571(472k) Apparently, the modifications I made to add
checks for small cap markup, to put it bluntly, didn't work. Went back
in and fixed several stupid errors to get it working and added some
code to do better boundary condition checking.
Modified fixup function to not remove spaces before a full stop if it
is followed by a digit. Tweaked a few other regexes to run a bit more
efficiently.
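A minimal sketch of such a rule, written in Python rather than the Perl used by guiguts. The regex and the function name fix_space_before_stop are my own illustration, not the actual fixup code.

```python
import re

def fix_space_before_stop(line):
    """Remove spaces before a full stop unless a digit follows the stop."""
    # (?!\d) is a negative lookahead: skip stops that start a number like .75
    return re.sub(r" +\.(?!\d)", ".", line)

print(fix_space_before_stop("The end ."))
print(fix_space_before_stop("It cost about .75 of that"))
```

The first line loses its stray space; the second is left alone because the stop is the start of a decimal number.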
Version
.57(472k) Modified guiguts to provide an interface to
jeebies. I am not including jeebies in the guiguts distribution since
it is larger than a whole guiguts distribution including gutcheck.
Either get it from this forum
thread or from the sourceforge page.
(Not current as of this writing.) Guiguts will only interface correctly
with version .12 or above. Run jeebies from the fixup menu just below
gutcheck. Much like gutcheck, the first time you run jeebies, it will
ask you to locate the executable. It will then pop up a window with the
suspect occurrences of he & be; each one clickable to jump directly
to the queried phrase.
Updated to the release version of gutcheck .99. If you compile your
own, please recompile to get the release version which does have a few
bug fixes. I inadvertently released guiguts .561 with a pre release
version gutcheck .99.
Added search for orphan small caps markup to the HTML orphans search.
It isn't really HTML markup, but it follows the rules for it, so it was
easiest to just add it in there.
Removed "-mustexist" directive from the directory chooser for the pngs
directory as it was causing trouble under *nix.
Version
.561(487k) Fixed problem with Aspell dictionaries not
being displayed in the spellcheck options dialog. I broke the
dictionary loading routine when I changed to lexical three argument
opens a few versions back. I didn't realize the syntax for 3 argument
opens was slightly different from 2 argument opens.
Fixed a few problems with grossly oversized fonts (most noticeably on
the proofers pop up). When I changed the font handling code a few
versions back, I missed updating a few spots. I think I've fixed the
rest.
I liked the drag pad on the main window so much that I added it to most
of the pop up windows that might need to be resized. It required that I
make optional scrollbars compulsory, but I think the trade off was
worth it.
Finally noticed that Jim released .99 gutcheck 4 months ago, grabbed a
copy and updated the gutcheck interface to work with the newest
features. I had done quite a bit of it about a year or so ago when I
was working with a pre-release copy, so getting it finalized didn't
take too long.
Now including gutcheck .99 with the distribution. (Bump from .97)
Version
.56(462k) Worked quite a bit on layout and sizing issues.
Fixed quite a few things that will likely be invisible to the average
user, but bothered me.
Finally resolved issue with disappearing status bar if window was sized
less than about 10 lines of text. Now can be reliably sized down
to one line of text without the status bar disappearing. I changed the
way I was tracking window sizes so .56 is (slightly) incompatible with
.551. The only real incompatibility is your saved window size will be
off. (You may need to resize your window the first time you run it.)
Fixed problem with window size jumping when the line numbers were
enabled/disabled.
Added a "drag handle" to the lower right corner of the text window to
make it easier to resize the window. It could be quite fiddly to get
the cursor exactly on the window border to click and drag it. Now there
is a 14 x 14 pixel pad you can use to resize.
Rewrote and generally cleaned up some other code here and there. Not to
add functionality so much as to improve maintainability.
Added a Unicode Character Search button to the tool bar. Just another
way to access the function. Also available under the Help menu.
Fixed problem where HTML auto table was not closing cells correctly.
Version
.551(461k) Fixed minor puzzling error where the Cut and
Copy commands didn't work if invoked from the menu. The keyboard
shortcuts still worked fine so it wasn't a huge issue. Odd,
because they both executed the same code, the menu commands were just
invoking it indirectly. Changed so that both call it explicitly.
Version
.55(461k) Made extensive modifications to the font
handling code. Refactored to use more modern methods. Changes should be
fairly transparent to the users but maintenance is much easier. You may
notice a slight difference in sizes. The new methods store and use the
size number slightly differently. When you first start up, you
may need to reset your font preferences. The guiguts setting.rc
file WILL NOT be backward compatible with previous versions unless you
edit it to remove the $fontsize, $fontweight and $utffontsize
variables. (In which case, they'll reset to defaults.) Note: with
the new font handling code, it is actually possible (and legal!) to
have negative font sizes. Positive sizes are in points, negative sizes
are pixel widths.
Worked quite a bit on the Unicode character search function. It is now
more compact. The letter names, ordinals and blocks are now fixed size
and font. Only the character itself is displayed in a variable
font/size. I made a bunch of enhancements, some of which required some
sacrifices. First the sacrifices:
The list is no longer scrollable with a mouse wheel. An unfortunate but
necessary choice to gain a bunch of other functionality. There is still
a scroll bar, you just need to use it directly.
You can no longer cut and paste the characters directly. Again, an
unfortunate side effect, but one I think I have worked around quite
nicely.
Now the benefits:
Left click on a character to automatically paste it into the text
window at the cursor.
Right click on a character to stuff it into the clipboard buffer, you
can then paste it wherever you want. Very handy for pasting characters
into the search entry box without having to paste it into a document
and then copy it.
Left click on a character description to pop up a window with that
entire character block in it (using the same mechanism as the Unicode
menu.) Even character blocks which AREN'T available through the Unicode
menu are available this way.
Unicode Block coverage below hex FFFF is now complete. Every block is
now available and properly identified in the character
description. I am now using a core module (script really,) to do
the block/character lookups. The FFFF limit is due to a limitation in
Perl/Tk, not Perl. Perl/Tk cannot handle characters with an ordinal
greater than FFFF at this point.
Modified the Unicode character menu pop ups to also support the right
mouse click to stuff the character directly into the clipboard buffer.
Version
.546(459k) Changed some code in the guiprep import
function that was causing trouble under Linux. Removed an option to not
allow you to enter non-existent directory names. Replaced it with a
manual check to see if the supplied directory name exists before trying
to open it.
Fixed problem with HTML auto table function where it was erroneously
removing spaces around <i> and <b> markup.
Fixed a problem with an unclosed file handle in the character count
function which could cause file saves to fail. Went through the entire
file and converted all of the file handle operations to use lexical
file handles, which should avoid future problems with file handles
staying open beyond their scope.
Version
.545(458k) Messed around with the Unicode Character
Search function some more. There's a saying among Perl programmers;
"First make it work, then make it fast." Yesterday I made it
work. :-) Today, I sped it up about 90%.
Version
.544(458k) Modified header.txt file with new CSS for
sidenotes that will not generate warnings at the w3c CSS validator. It
WAS ok before, but w3c has updated their validator and it was carping
about a missing foreground color attribute.
Added a Unicode Character Search pop up under the Help menu. Ever need
a Unicode character but didn't know which character block it was in or
where to look for it? Now use this handy code point search tool. Say
you need a y with a macron. Now if you work with it often, you may know
that it's in the Latin Extended-B character block. If not, it's trial
and error searching for it. Now, you can pop up this tool, enter "y
macron" (no quotes, case insensitive) into the Search Characteristics
box and press Search. It will scan through the Unicode character names
looking for one with those properties. It will quickly find:
Ȳ - LATIN CAPITAL LETTER Y WITH MACRON - Ordinal 0232
ȳ - LATIN SMALL LETTER Y WITH MACRON - Ordinal 0233
The actual character, the full name for the character and the hex
ordinal of the character. You can easily cut and paste it.
Want to see if there is a character for a Maltese cross? Try it.
Search Characteristics - Maltese cross
✠ - MALTESE CROSS - Ordinal 2720
Cool huh?
Now the caveats. (You knew there were caveats, didn't you?)
1) You can't use the tool to locate characters with ordinals over hex
FFFF. This is more a limitation of Perl/Tk than anything else. Perl/Tk
only understands characters up to FFFF.
2) I chopped out all of the CJK (Chinese-Japanese-Korean) ideograph
blocks. There's just so darn many of them, it was seriously slowing
down the searches. (It's none too speedy still...)
3) Not as critical, I chopped out the private use block too. It
is pretty unlikely that anybody is going to be using private use glyphs
in a Gutenberg bound text anyway.
By eliminating 2 & 3 I reduced the search space by about 80% (and
thus sped up the search by about 500%.)
While the search is ongoing, the text background will turn gray. On
completion, it will turn white again. If you want to interrupt it, hit
Stop. If you close the window while a search is in progress, you WILL
end up with a bunch of (harmless) warnings in the console window.
The results window uses the same font as selected in the Unicode
character block pop up windows.
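For readers curious how a name-based character search of this kind can work, here is a small sketch using Python's standard unicodedata module. It is not the guiguts implementation (which is Perl). The skipped code point ranges are my own guess at what the CJK ideograph and private use exclusions cover, and matching on whole words of the name is a design choice made for this example.

```python
import unicodedata

# Ranges skipped to keep the search small. These bounds are an assumption
# about what "CJK ideograph blocks" and "private use block" cover.
SKIPPED = [(0x3400, 0x4DBF), (0x4E00, 0x9FFF), (0xE000, 0xF8FF), (0xF900, 0xFAFF)]

def search_characters(query, limit=0xFFFF):
    """Return (ordinal, name) pairs whose name contains every query word."""
    wanted = set(query.upper().split())
    hits = []
    for ordinal in range(limit + 1):
        if any(low <= ordinal <= high for low, high in SKIPPED):
            continue
        name = unicodedata.name(chr(ordinal), "")   # "" for unnamed code points
        if name and wanted <= set(name.replace("-", " ").split()):
            hits.append((ordinal, name))
    return hits

for ordinal, name in search_characters("y macron"):
    print(f"{chr(ordinal)} - {name} - Ordinal {ordinal:04X}")
```

Matching whole words keeps a query like "y macron" from also hitting every name that merely contains the letter Y somewhere, such as the CYRILLIC letters with macrons.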
Version
.543(456k) Tracked down a bug in the file saving code
where a relatively rare set of circumstances could block the file from
being saved.
Tweaked the Page Anchor HTML code again based on the results of the
testing done in the "Lots of Links" thread in the Post Processing forum.
Version
.542(456k) Worked on the image filename handling code
some more to try to get it to play nicely with jpeg image files. Turns
out I had much more hard coded png extensions than I thought.
Tweaked HTML Page anchor insertion code to not add spurious paragraph
markup if the page break falls within poetry. Still probably not
perfect, but at least it tries to avoid it now.
Modified Generated HTML Page Anchor code again because Internet
Explorer still wasn't liking it. (And my distaste for Internet Explorer
is starting to tip toward complete disgust.)
Tweaked block markup overrides so that the first line indent will be
repeated for each paragraph inside the block. I had specifically made
this NOT happen at someone else's request several (many) versions back.
I am changing it because it is much easier to add overrides to stop it
than to add an override at each paragraph in the block. Note: If you
need to change the overrides inside a block quote, you do not need to
end each block separately. Any block end encountered ends ALL block
quoting. For example, consider the following very boring passage:
/#[5.3,60]
Text text text text text text text text text text text text text text
text text text text text text text text text text text text text text
text
text text text text text text text text text text text text text text
text text text text text text text text text text text text text text
text text text text text text text text text text text text text text
text text text text text text text text text text text text text text
text text
text text text text text text text text text text text text.
Text text text text text text text text text text text text text text
text text text text text text text text text text text text text text
text text
text text text text text text text text text text text text text text
text text text text text text text text text text text text
text text text text text text text text.
/#[8.48]
Text text text text text text text text text text text text text text
text text text text text text text text text text text text text text
text text text text text text.
#/
Rewrapping will indent the first lines of the first two paragraphs 3
spaces, the remaining lines of the first two paragraphs 5 spaces and
the third paragraph 8 spaces. There is no need to specifically close
both blocks.
Tweaked the wrapping routine to honor non-breaking spaces again. I had
very carefully done this when I wrote the wrapping routine but later
changed it to automatically convert non-breaking spaces to regular
spaces on wrap because there was a burst of projects coming through
with many spaces converted to non-breaking spaces during proofing, and
it was causing lots of support questions about the wrapping routine
"not working". I think the utility of honoring the non-breaking spaces
outweighs the confusion it might cause though.
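To show what "honoring non-breaking spaces" means in a wrapping routine, here is a minimal greedy wrap in Python that breaks only at ordinary spaces and never at U+00A0. It is a sketch for a single paragraph without newlines, not the guiguts wrapping code; the function name wrap and the default width are assumptions.

```python
def wrap(text, width=72):
    """Greedy wrap that breaks only at ordinary spaces, never at U+00A0."""
    lines, current = [], ""
    for word in text.split(" "):          # U+00A0 is not split on
        if not word:
            continue                      # collapse runs of ordinary spaces
        if current and len(current) + 1 + len(word) > width:
            lines.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        lines.append(current)
    return lines

print(wrap("Page 12 of Vol.\u00a0III is missing", width=12))
```

Here "Vol." and "III" are joined by a non-breaking space, so they move to the next line together instead of being split at the margin.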
Added a non-breaking space to the Latin-1 character pop-up.
Version
.541(454k) Tweaked the arbitrary character highlight
dialog a bit. Added some buttons to automatically select the previous
selection or the entire file.
Revised the image viewer calling code to be compatible with jpeg
images. It was hard coded to expect pngs only. Now it is hard coded to
expect pngs or jpegs only. If we start accepting more formats on the
site, I'll need to make it more general. It still expects pngs and
defaults to pngs, it just will accept jpegs now.
Fixed minor problem where if you pasted some text into an empty file
and then saved it as a new file, the file name was not getting updated
in the title bar or the recent file list.
Version
.54(454k) Modified Generated HTML Page Anchor code to be
compatible with Internet Explorer.
Added a new Highlight menu option in addition to the Highlight Single
Quotes and Highlight Double Quotes; Highlight Arbitrary Characters.
Choose whatever character or sequence of characters you would like to
be highlighted in the selected text. Choose to highlight exact text or
a regex. Exercise a little caution; if you select the entire file and
choose to highlight '.' (any character) in regex mode, be prepared to
wait a while.
Changed the quote (and arbitrary text) highlight color to a light
lavender. It was the same orange as the search highlight.
Version
.538(453k) Fixed problem where Save As would not change
the name of the loaded file to the new file name.
Version
.537(453k) Fixed problem where Save As would not create a
new .bin file if it didn't already exist.
Version
.536(453k) Worked on a few aspects of the file save
routines (for both the text file and bin file) to avoid problems with
read-only directories and/or files. If the directory or file is
read-only, the save routines will attempt to modify the write
permissions to allow you to save the file(s) anyway rather than failing
with error messages. Hopefully this will banish the sporadic save
issues under Windows. Note: Under Linux/Unix, if you aren't the owner
of the directory/file, it still may fail with permission errors.
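The idea of trying to add write permission before saving can be sketched like this in Python. This is an illustration of the approach only, not the guiguts save code; the function name make_writable is hypothetical.

```python
import os
import stat

def make_writable(path):
    """Try to add owner write permission; return True if path is writable."""
    if not os.access(path, os.W_OK):
        try:
            mode = os.stat(path).st_mode
            os.chmod(path, mode | stat.S_IWUSR)
        except OSError:
            return False          # e.g. not the owner under Linux/Unix
    return os.access(path, os.W_OK)
```

A save routine could call make_writable on both the file and its directory and only report an error if it returns False.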
Fixed problem with out-of-order markup on page anchors that weren't
within paragraphs during HTML auto-generate.
Fixed a few other minor problems that would cause warnings in the
console window during HTML operations.
Version
.535(453k) Well, this one is just embarrassing. Typed the
wrong variable name in the settings save routine which was writing
bogus values for the selected aspell dictionary. When you would try to
run a spell check, it would have NO idea what you were asking it to do
until you manually selected a dictionary to use. Entirely my
fault. .....Well.... I guess the whole thing is entirely my fault
except for a couple of good bits here and there that various people
submitted.... sigh.
Added another non-standard replacement assertion for Search and
Replace: \G .. \E - Greek Transliterate. (Don't expect to find
this anywhere but guiguts, it is extremely non-standard. \G already has
a defined meaning in regexes, but not a useful one under guiguts, so I
am overloading it.) Very useful to do automatic transliteration. Say
you've got things like [Greek: moly/bdides] or [Greek: kekryphalos] or
[Greek: nykteri/des] or [Greek: pharmakon] scattered throughout your
text, (like I do,) and you want to provide a Unicode version. Do a
search: (\[Greek: ((.|\n)+?)\]) and replace: \G$2\E and end up
with μολύβδιδες or κεκρυφαλος or νυκτερίδες
or φαρμακον in one easy operation.
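The search-and-transliterate step can be approximated outside guiguts as well. The sketch below is Python, not the Perl behind \G .. \E, and its letter table is a small hypothetical subset: it ignores accent marks such as "/" and breathings, so it would not reproduce the accented examples above.

```python
import re

# Hypothetical, partial transliteration table for illustration only.
DIGRAPHS = {"ph": "φ", "th": "θ", "ch": "χ", "ps": "ψ"}
LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ", "i": "ι",
    "k": "κ", "l": "λ", "m": "μ", "n": "ν", "x": "ξ", "o": "ο", "p": "π",
    "r": "ρ", "s": "σ", "t": "τ", "y": "υ", "u": "υ",
}

def to_greek(latin):
    """Transliterate unaccented Latin text to Greek letters."""
    out, i = [], 0
    while i < len(latin):
        pair = latin[i:i + 2].lower()
        if pair in DIGRAPHS:              # two-letter combinations first
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            out.append(LETTERS.get(latin[i].lower(), latin[i]))
            i += 1
    # Use the final form of sigma at the end of each word.
    return re.sub(r"σ\b", "ς", "".join(out))

def replace_greek_markup(text):
    """Replace each [Greek: ...] note with its transliteration."""
    return re.sub(r"\[Greek: ((?:.|\n)+?)\]",
                  lambda m: to_greek(m.group(1)), text)

print(replace_greek_markup("the [Greek: pharmakon] and the [Greek: kekryphalos]"))
```

The callback passed to re.sub plays the role of the \G .. \E wrapper: the captured text is run through the transliteration before being substituted back in.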
Version
.534(451k) Making one last attempt at solving the
problems with my new save routines before rolling back to the old ones.
Fixed problem with file save if your temp
directory was located on a partition other than the one where your
project directory was. (Not uncommon for Linux/Unix systems.) The
temp file is now saved in the same directory as the original file
rather than system temp directory. Hopefully this will help with the
permissions problem too.
Fixed minor problem with prep file import functions. Now adds a newline
before the page separator. If a file didn't end with a newline, the
page separator was being appended to the last line of the previous
file. Not horrible, but annoying.
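The newline fix amounts to checking how the previous page ended before appending the next separator. A minimal Python sketch follows; the separator format shown is a placeholder invented for the example, and join_pages is a hypothetical name, not a guiguts function.

```python
def join_pages(pages):
    """Concatenate (name, text) pairs with a separator line before each page."""
    result = ""
    for name, text in pages:
        if result and not result.endswith("\n"):
            result += "\n"                # keep the separator on its own line
        # Placeholder separator format, for illustration only.
        result += f"-----File: {name}-----\n" + text
    return result

print(join_pages([("001.txt", "first page"), ("002.txt", "second page\n")]))
```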
Went through the settings save routine and ensured that any setting
that could possibly contain a single quote would escape it on saving.
Version
.533(449k) Fixed problem where file would sporadically
not save with "Permission denied" errors, even if you did have write
permissions in the directory.
Fixed a problem with the Save As function where the bin file was not
being saved correctly.
Worked on the Footnote Fixup - Check footnotes function a bit, added
another check, fleshed out the error mode descriptions a bit, made the
header line not disappear while scrolling through the list.
Made Replace All work with a null replacement term.
Fixed a few problems with Tidy check window where it would have
index errors when it was run more than once in a session.
Fixed problem with Page Label popup where it would not remember
previously set labels under certain circumstances.
Version
.532(449k) Fixed bug in save routine where it was
saving with Unix style line endings, even under Windows.
Modified the search & replace dialog to allow you to
replace with nothing; (delete). Previously, if the replacement term was
empty, no operation was performed.
Version
.531(449k) Sigh. Bugs galore. Fixed stupid assumption in
Page Label dialog that the image files will be contiguously numbered.
Fixed bug that caused the Page adjust buttons to come up blank under
certain circumstances.
Fixed a few warnings that were showing up in the console.
Version
.53(449k) Reworked new search dialog with multiple
replacement terms to be user configurable whether to show single or
multiple terms. Makes the dialog less cluttered when you don't
need/want them, but they are instantly available if you do.
Completely disconnected the image numbers from the page labels. You can
now edit the page labels without affecting which image is displayed
when "See Image" is selected. Added a new pop up dialog linked
from the HTML Fixup window where the page labels can be easily
customized. You can now easily do page offsets, Roman
or Arabic numbering, restart the numbering arbitrarily, skip
pages in the numbering sequence, just about anything you could want.
(Except compound numbers, e.g. 1-1, 1-2, 1-3, 2-1, etc. I still haven't
figured how to handle those.) Since this is active, I have removed the
redundant "Page Offset" features from the HTML window. To use the
dialog, you must select if a page will be Arabic, Roman numerals, or
the same as the previous page. Then select an action for each page
label, either add 1 to the previous page label, start over from an
arbitrary number, or do not label. The start point must be an Arabic
number, even for Roman labels, e.g. 5, not v. It sounds complex but is
pretty intuitive to use. Once you have your layout arranged, press
"Recalculate" to modify the labels to reflect the changes. (It was
processor intensive to continuously monitor and recalculate the labels
so I made it manual.) If the new labels are acceptable, press "Use
These Values". When the HTML is generated, the page label will be
used for the page anchors.
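The recalculation described above (a style per page, plus an action of add 1, start over, or do not label) can be modeled roughly as follows. This is a Python sketch under my own assumptions, not the guiguts code: the tuple layout, the lowercase Roman output, and the rule that an unlabeled page does not consume a number are all choices made for the example.

```python
def to_roman(number):
    """Lowercase Roman numeral for a positive integer."""
    table = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
             (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
             (5, "v"), (4, "iv"), (1, "i")]
    out = ""
    for value, glyph in table:
        while number >= value:
            out += glyph
            number -= value
    return out

def recalculate(pages):
    """pages: list of (style, action, start) tuples, one per image.

    style:  "arabic", "roman" or "same" (as the previous page)
    action: "+1", "start" (restart at `start`) or "none" (no label)
    """
    labels, style, number = [], "arabic", 0
    for page_style, action, start in pages:
        if page_style != "same":
            style = page_style
        if action == "none":
            labels.append(None)           # assumption: number is not advanced
            continue
        number = start if action == "start" else number + 1
        labels.append(to_roman(number) if style == "roman" else str(number))
    return labels

layout = [("roman", "start", 5), ("same", "+1", None), ("same", "none", None),
          ("arabic", "start", 1), ("same", "+1", None)]
print(recalculate(layout))
```

Note that the start point is given as an Arabic number (5) even for the Roman pages, matching the rule stated above.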
The page label information is saved to the bin file associated with the
text. The page offset information is saved twice, redundantly for now,
to retain backward and forward compatibility. After a few releases,
when pretty much everyone has upgraded, I will probably start to phase
out the older style offset tables.
Rearranged the status bar a little bit. Tweaked a few cells to conserve
space. Added a Page Label readout to the status bar next to the Image
number and See Image cells. It reads out the label that will be used
for HTML generation. If no custom page labels have been configured,
displays "No Label".
Made the HTML automatically insert visible page numbers by default. If
you don't want visible page numbers, put "display: none;" in the CSS
style for the pagenum class.
Worked on display bug where the status bar would be covered up when the
window was reduced below a certain height. Came up with a "fix": if the
line numbers are off, you can reduce the window height to about 3 lines
of text and the status bar remains visible; otherwise, you can only
reduce it to about 20 lines of text. Sorry, but that's probably about as
good as I can do.
Rewrote Save routine to be a little more robust. The original Save
routine would open the file on the disk, clear the disk file, then
write the file in memory to the disk. This works fine most of the
time, and, in fact, is the way the standard text widgets under Tk do
it. However, during a save operation, after the disk file was cleared
and before the file was completely transferred from memory, if there
was some type of glitch (with guiguts, with the OS, with the hardware,
whatever,) you would end up with a partial file and a funny look on
your face. Not a particularly happy situation. Now the save routine
writes the file in memory completely to a temporary file on the hard
disk, verifies that it is intact, then
renames the temporary file to the original filename. One obvious
drawback to this is that there MUST be at least double the size of a
file in free hard disk space or the file will not be able to be saved.
(Shouldn't be much of an issue with today's hard disks.) The
resultant data integrity is a worthwhile trade off though, in my
opinion.
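The write, verify, rename sequence is a common pattern and can be sketched in a few lines. The version below is Python, not the guiguts Perl routine; the ".tmp" suffix and the function name safe_save are hypothetical, and the verification here is a simple read-back comparison.

```python
import os

def safe_save(path, text):
    """Write text to a temp file beside path, verify it, then rename it."""
    temp_path = path + ".tmp"             # hypothetical temp-file naming
    data = text.encode("utf-8")
    with open(temp_path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())         # push the bytes out to the disk
    with open(temp_path, "rb") as handle:
        written = handle.read()
    if written != data:                   # verify before touching the original
        os.remove(temp_path)
        raise IOError(f"verification failed for {temp_path}")
    os.replace(temp_path, path)           # swap in the verified copy
```

If anything goes wrong before the final rename, the original file on disk is still intact; the cost is the extra disk space for the temporary copy.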
Twiddled with the Greek transliteration tool a little bit. Tried to
make the y/u <--> upsilon conversion a little more intelligent.
Version
.52(445k) Fixed minor problem in Greek transliteration
tool where a 'gamma chi' converted to transliteration and back, was
being rendered as 'nu chi' rather than 'gamma chi'.
Made minor change to poetry CSS markup in header.txt file to make long
lines of poetry in narrow browser windows wrap in a more correct manner.
Fixed a problem with indent overrides on blockquotes where the
overrides were basically just being ignored.
Tweaked regex searches with newlines to also match newlines with a dot
(.) character. Only affects searches that explicitly include a newline
character in the search term.
Fixed problem where Control+` shortcut was not working for column
paste. (F3 still was, so it wasn't desperate.)
Fixed obscure bug in wrapping code where standalone zeros would
mysteriously vanish.
Added some import and export routines to import pre-DP text files where
each file is one page named with 3 or digits and a .txt extension
(001.txt, 002.txt, etc.,) and to export a text file by splitting it
into individual page text files named with the page number. This is
mostly in response to several requests by PMs who would like to be
able to use guiguts functions to work on pre-DP text files.
Modified the search dialog to have three separate replacement
terms so you can easily select among a few possibilities when doing
search and replace. (Much like under guiprep, but with regexes!)
The hot keys, ( Control+Enter, Shift+Enter, Control+Shift+Enter) only
work on the top replacement term, all others must use the mouse
buttons.
I may look into having the number of replacement terms configurable in
the future but left it alone for now.
Changed status bar readout to display Img. (number) instead of Page
(number). Strictly cosmetic at this point. In anticipation of
coming up with a mechanism to separate tracking of folio page image
numbers from book page numbers.
Changed the directory selection dialog for selecting the image
directory. Used a more standard dialog. Didn't really add any
functionality, but made it easier to:
Added another binding to the See Image status readout. Right click will
now bring up the image directory selection dialog.
Version
.51(443k) Modified the proofer viewing functions to work
with the upcoming four round modification to the site. It now defaults
to displaying the user name for each "round": Proof 1, Proof 2,
Format 1, Format 2. If you are working with a file done when there were
only the original two rounds, the Format round user will be listed as
<none> and the
counts will be 0 for those rounds. With a bunch more work I could have
made the Format rounds not display if they weren't populated in the
page separators, but I didn't really feel like going through all that
bother for something that is (hopefully) going to be pretty
temporary.
Added buttons to the proofer pop-up to be able to sort on the
additional rounds.
The 4-round changes should be both backward and forward compatible.
Four round files that are processed with pre .51 guiguts will work ok,
they just won't handle the two extra rounds. The Format round user
names/counts will be lost. Two round files processed with .51 or later
will just have zeros & <none> for the format round page
counts and user names.
If you want to play around with a (short, bogus) four round file, you
can find one here.
Guiguts now automatically saves the file and
.bin just before performing HTML autoconvert so you can back out again
in case of trouble. The file will be saved to filename-htmlsave.txt and
filename-htmlsave.bin. You are still encouraged to save the file which
will be used to generate the HTML to a different name from the text
file before autoconversion. This is intended to be a back-out mechanism
for the autogenerate process.
Worked on regex search and replace to try to work around a few
bothersome problems.
1) Not being able to use positive lookarounds in the search term while
doing replacements. This is still not perfect, though much improved. In
general, you can now use positive lookarounds, EXCEPT if they
contain a literal closing parenthesis. In that case, just capture the
parenthesis and add it back in.
2) Not being able to include a literal dollar sign followed by a digit
in the replacement text. You can now use dollar signs in the
replacement text, even when followed by a digit, you just need to
escape the dollar sign with a backslash. You only need to escape the
dollar sign when followed by a digit, though it won't hurt to do it all
of the time.
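The escaping rule for dollar signs in the replacement text can be illustrated with a small template expander. This is a Python sketch of the rule as described, not the guiguts implementation; expand_replacement is a hypothetical name and only single-digit group references are handled.

```python
import re

def expand_replacement(template, match):
    """Expand $N group references; a backslash-escaped dollar stays literal."""
    def substitute(token):
        if token.group(0) == r"\$":
            return "$"                    # escaped: emit a plain dollar sign
        return match.group(int(token.group(1))) or ""
    return re.sub(r"\\\$|\$(\d)", substitute, template)

found = re.search(r"(\d+) dollars", "It cost 5 dollars")
print(expand_replacement(r"\$$1", found))
print(expand_replacement(r"\$1 each", found))
```

In the first call the escaped dollar is kept and $1 is replaced by the captured number; in the second, the escape stops "$1" from being read as a group reference at all.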
Modified the Fixup function to allow selection of either French style
«guillemets» or German style »guillemets«.
Version
.503(440k) Fixed another problem in HTML autogenerate
code. Subscript and superscript code was not being correctly added to
text enclosed in certain markup tags. Order-of-operations problem.
Reworked the middle button autoscroll code a bit. Made the pop up
indicator a little less obtrusive. Changed the cursor while it is
active to give a better indicator that it is. Changed how the scroll
works at low speed. Still can't do a smooth, pixel level scroll, but
the speed is now adjustable down to a very low rate without needing to
reduce the update speed. Since it is no longer really necessary
to adjust the update speed manually, I removed the option from the
Prefs menu again. Apparently, pixel level scrolling is available in Tk
8.5 which will eventually be ported to Perl/Tk 805.xxx. Sigh.
Integrated many of the various support files into the main script
file to cut down on dependencies. At this point, the script file itself
(either guiguts.pl or winguts.exe) can be dropped into an empty
directory and run without raising any errors. There are still
external support files that are highly recommended, but they are not
absolutely necessary to run the script. (HTML manuals, scannos files,
gutcheck, etc.)
Version
.502(462k) Worked on HTML autogenerate problems for a
bit. Made some changes to try to trap unclosed single line
paragraphs at the start of block quotes. Kind of a kludge, seems
to work,
though I'm not really proud of the code.
Added checks to see if a page anchor is inside a paragraph or not and
to add paragraph markup around it if it is not. The checks add quite a
bit of overhead to the page anchor insertion routine and increase the
processing time noticeably (by a few seconds), but should cut down on
the amount of manual intervention necessary to get it to validate as
XHTML 1.0 Strict.
Poked around with the ampersand conversion routines in HTML
autogenerate. Found a few instances where it would not convert them
correctly and reworked them.
Added a new regex to the regex .rc file: '&c(,| |$)' =>
'&c.$1' Looks for an abbreviation "&c." that doesn't
have its period and inserts one.
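For illustration, the same search and replace can be tried outside guiguts. The sketch below uses Python's re module (where the capture is written \1 instead of $1); the function name fix_etc is hypothetical, and the MULTILINE flag is an assumption so that $ matches at the end of every line.

```python
import re

def fix_etc(text: str) -> str:
    # "&c" followed by a comma, a space, or end of line gets its period back.
    return re.sub(r"&c(,| |$)", r"&c.\1", text, flags=re.MULTILINE)

print(fix_etc("pens, ink, &c, and paper &c"))
```

An "&c." that already has its period is left alone, because the character after "&c" is then a period and none of the three alternatives match.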
Made some more changes to the lazy updates of scanno highlighting. As
soon as highlighting is enabled, it will start to work through the file
adding the highlighting in the background. It is quite processor
intensive so it would be a problem to try to do the file all at once,
but by processing a small chunk at a time, it can work through the
whole file in a reasonable time in the background. The current view is
still continuously updated.
Reworked some of the status bar code to update in real time while doing
drag selections.
Modified status bar code to have a little more stable cell size layout.
Tried to minimize the "dancing window" effect when the character
descriptions were enabled in the character ordinal readout.
Added another option under the Prefs menu; 'Leave Space After
End-Of-Line Hyphens During Rewrap'. Should be pretty self-explanatory.
While not very useful for English texts, other languages, especially
German, will have fairly common occurrences of "hanging" hyphens. This
option will keep them intact during rewrap. Standard behavior is to
join the final hyphen with the first word on the next line during
rewrap.
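The difference between the two behaviors can be sketched as follows. This is a Python illustration of the line-joining step only, not the script's rewrap code; the function name and the keep_space_after_hyphen parameter are hypothetical.

```python
def join_lines(lines, keep_space_after_hyphen=False):
    """Join the lines of a paragraph into one string before re-breaking it."""
    text = ""
    for line in lines:
        line = line.strip()
        if not text:
            text = line
        elif text.endswith("-") and not keep_space_after_hyphen:
            # Standard behavior: the final hyphen joins the next word directly.
            text += line
        else:
            text += " " + line
    return text

print(join_lines(["Ein-", "und Ausgang"]))
print(join_lines(["Ein-", "und Ausgang"], keep_space_after_hyphen=True))
```

With the option off, the hanging hyphen in "Ein-" is glued to "und"; with it on, the space is kept and the hanging hyphen survives.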
Changed the binding for the middle mouse button, which used to be a
very user unfriendly paste routine, to instead activate an autoscroll
routine based heavily on the middle button autoscroll as found in the
Firefox web browser. I use this all the time in Firefox and find myself
trying to use it in other programs too, so I added it here. It doesn't
scroll quite as smoothly as the Firefox version because the text widget
doesn't support pixel level scrolling, (easily) so it has to scroll in
even increments of the height of a line of text. Click the middle
button in the text window to enable it, move the pointer up to scroll
down and down to
scroll up. Press any button or key to disable it again. Middle click
drag still works too. If you press the middle button and drag at all,
the scroll sigil will not pop up. I tried, but was unable to make the
scroll sigil background transparent. In practice, it isn't a big deal,
just position it off to the right of the main body of the text.
Added an option to adjust the update interval for the above scroll
function. It defaults to a 50 millisecond update interval. If this is
too fast for you, you can increase the update interval to slow down the
scrolling. Intervals less than 30 milliseconds are not recommended, since
they increase processing load. Intervals larger than 100 milliseconds will probably
be unpleasantly "choppy".
Version
.501(462k) Fixed up a few problems with the Unicode
character menus, the most annoying of which was that changes to the
font or font size would not propagate through the window. Changes would
not take effect until the Unicode window was closed and reopened.
Rewrote the mouse selection auto scroll code to be more compact and
speedy. Added (pseudo) logarithmic accelerator functions: the
further you drag
the pointer outside the text window (top or bottom) the faster the
window scrolls.
Modified line number updating code to respond to mouse selection auto
scroll better. It now updates in real time, or nearly so.
Tweaked scanno highlighting function to be a bit faster and to do lazy
update of the highlights. Once a word has been highlighted, the
highlighting is not removed (unless the word is edited) until
highlighting is turned off again. This doesn't really make the
highlighting run faster, but it makes it feel faster.
Fixed missing semicolon on &quot; in HTML illustration markup.
Fixed problem in Footnote Fixup where Auto Landing Zones wasn't working
correctly.
Twiddled around with italics handling during poetry markup generation.
Should be a bit better. Probably still will have problems with heavily
mixed italic and non-italic poetry.
Added italics detection to the non-rewrap markup autogeneration code.
Same caveats as for poetry. (Not a surprise, it uses the same
algorithm.)
Version
.50(461k) Rewrote Unicode character name handling
routines to use built in perl functions rather than loading a
precompiled hash, seriously reduced size of guiguts package with little
penalty. In fact, it seems to start up a little faster now, though I
haven't really done any benchmarks.
Cleaned out a bunch of no longer needed modules that were using memory to
no purpose.
Fixed major bug in Footnote fixup Reindex function that was pretty much
keeping it from running. Fixed several minor problems too, things not
exactly wrong, but sub-optimal.
Added the "arabic" subroutine to the package. (Actually, it was
available in .49 but I forgot to document it.) This is mostly only
useful for converting roman numerals to arabic numbers with regex code
assertions. E.G. Search: \b([IVXLCDM]+\.) Replace: \C arabic("$1")
\E will convert III. to 3 or MCCCLXVII. to 1367 and so on.
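For readers curious what such a conversion involves, here is a rough Python equivalent. It is a sketch, not the script's Perl subroutine: the name arabic is borrowed from the text, convert_numerals is hypothetical, and ignoring the trailing period is an assumption based on the III. to 3 example.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def arabic(roman: str) -> int:
    """Convert a Roman numeral to an int; a trailing period is ignored."""
    total = 0
    previous = 0
    # Walk right to left: a smaller value before a larger one is subtracted.
    for letter in reversed(roman.rstrip(".").upper()):
        value = ROMAN_VALUES[letter]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total

def convert_numerals(text: str) -> str:
    # Same search pattern as in the text; the replacement is computed in code.
    return re.sub(r"\b([IVXLCDM]+\.)", lambda m: str(arabic(m.group(1))), text)

print(convert_numerals("Chapter III."))
```

Note that the pattern will also match ordinary capitalized words made only of these letters when they end a sentence (for example "MIX."), so review each replacement.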
Fixed misspelled codepage conversion menu option.
Changed how the cursor is handled while doing selections with the
mouse. The cursor now follows the mouse pointer. This makes it possible
to do updates of the line numbers while doing mouse selections that
scroll past the top or bottom of the screen. Makes it much easier to
select a block of lines by line number larger than the current screen.
And besides, it just bothered me that the screen would scroll but the
line numbers wouldn't update.
Oh, and removed the darn setting.rc file that crept into the
distribution build somehow.
Version
.49(581k) Updated HTML autogenerate to automatically
convert <sc> .. </sc> markup to CSS <span
class="smcap"> .. </span> markup.
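As a rough illustration of this conversion (a Python sketch, not the script's code; the function name is hypothetical and case-insensitive matching of the tags is an assumption):

```python
import re

def convert_smallcaps(text: str) -> str:
    # Opening tag becomes a span with the smcap class ...
    text = re.sub(r"<sc>", '<span class="smcap">', text, flags=re.IGNORECASE)
    # ... and the closing tag becomes a plain closing span.
    return re.sub(r"</sc>", "</span>", text, flags=re.IGNORECASE)

print(convert_smallcaps("by <sc>Steve Schulze</sc>"))
```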
Modified how poetry markup for zero indent lines is handled. Changed it
to use <span class="i0"> rather than a bare <span>. This
will allow easy inclusion of smallcaps or underlined text (or anything
else that relies on span markup) within poetry without invoking other
side effects.
Modified html header.txt file to contain the new modified poetry style.
Made "Insert anchors at page numbers" selected by default for
HTML autogenerate.
Tweaked
a few other things for autogenerated HTML, mostly cosmetic changes.
Fixed a bug where if you had opened a Unicode character chart and then
tried to use the Greek transliteration tool, (or vice versa,) it would
throw errors and not work correctly. Variable collision.
Added another tool to the Fixup menu: "Convert Windows Codepage
characters to Unicode". Can be run standalone from the Fixup menu, is
automatically run as part of the HTML autoconvert routine. Acts on the
whole file.
Version
.482(579k) Fixed fairly obscure but potentially
destructive problem where poetry with line numbers combined with
certain indents would delete characters from the beginning of the
numbered lines.
Version
.481(578k) Fixed problem where starting the script would
occasionally throw an error about a missing subroutine. Seems to
be a timing issue. Once the file is loaded into the disk cache, it
seems to load without a problem. Needed to shuffle the loading order a
bit.
Fixed problem where the script was not stripping the BOM from the front
of the file on open. UTF-8 files would end up with an extraneous zero
width no-break space at the beginning.
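The problem is easy to reproduce: a UTF-8 file that starts with a byte order mark decodes to text whose first character is U+FEFF. A minimal Python sketch of stripping it (the helper name strip_bom is hypothetical):

```python
BOM = "\ufeff"  # U+FEFF, shown as a zero width no-break space when left in the text

def strip_bom(text: str) -> str:
    """Drop a leading byte order mark, if there is one."""
    return text[1:] if text.startswith(BOM) else text

raw = b"\xef\xbb\xbfChapter I"          # UTF-8 bytes with a BOM in front
print(repr(raw.decode("utf-8")))         # the BOM survives plain UTF-8 decoding
print(repr(strip_bom(raw.decode("utf-8"))))
```

In Python specifically, decoding with the "utf-8-sig" codec removes the mark in one step.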
Modified highlight words function to be a little more forgiving of the
format of the file of words it used as a source file. It used to only
accept files that had a single word per line with no punctuation. Now
you can have multiple words on a line and punctuation not inside a word
is ignored. (Hyphens and apostrophes inside words and commas inside
numbers will be retained.) Be a little cautious about the size of the
word list. Up to ten thousand or so words should be ok. Faster
processors can handle several hundred thousand without a problem. An
interesting experiment is to do a word frequency on a text, export the
list (Ctrl+x), then load the list as the highlight list. EVERY word
should be highlighted. (This is how I did testing.)
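The more forgiving word list format described above could be parsed along these lines. This is a Python sketch under stated assumptions, not the script's parser: the TOKEN pattern and the read_words name are hypothetical, and treating the underscore as a word character is a side effect of using \w.

```python
import re

# A number may contain internal commas; a word may contain internal
# hyphens or apostrophes. Any other punctuation separates tokens.
TOKEN = re.compile(r"\d+(?:,\d+)+|\w+(?:[-']\w+)*")

def read_words(lines):
    """Collect the distinct words found on all lines of a word list."""
    words = set()
    for line in lines:
        words.update(TOKEN.findall(line))
    return words

print(sorted(read_words(["arid, and; don't", "well-known (1,000)"])))
```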
Modified how the script decides where to insert any non-default poetry
indent classes into the CSS styles. They were winding up in some...
interesting places. It should be a little better about inserting them in
the correct, or at least a less nonsensical, spot. I can't just insert
them in a hard coded spot, because some people like to customize their
header.txt files and that will throw it off.
Version
.48(577k) Started refactoring to clean up some of the
most egregious faults in guiguts. Cleaned up most of the globals that
were spread throughout the script. There are still quite a few package
globals, but I don't see how to get rid of those without going to a
completely different, non-compatible method of saving values in the
settings and other peripheral files. Shuffled around a lot of
code, grouped a lot of initialization stuff together into
subroutines, tried to comment subroutines a bit better, basically did a
lot of code cleanup. Though lots of stuff has changed beneath the hood,
it shouldn't affect operation at all.
Fixed problem where the HTML External Link function was producing HTML
that was just plain wrong. Don't know what I was thinking. I don't use
that particular function much (at all) so I hadn't noticed it before.
Changed the last few self closing anchors to be explicitly closed.
Turns out I missed a couple, (though some were pretty obscure.)
Changed Auto generate HTML to not globally change " to &quot;. I
had done that early on, before my link routines had been worked on much
to prevent other headaches. At this point, my link generation code is
robust enough to not need the hand holding and it has become more of a
headache than a help.
Worked on auto link generation code to be more tolerant of Unicode
characters in the link text. Will now decompose any Unicode Latin
character to its nearest ASCII base character. For non-Latin
characters, it will decompose them to the ASCII name of the characters,
separated by hyphens. This is arguably the wrong way to "translate" the
characters, but since the link name/id can not have any non-ASCII
characters in them, it will yield valid and repeatable results. It kind
of falls down for some of the more complex languages. If the character
is not specifically a digit, letter or ligature, it will be represented
by -X- so ideographic languages may have a problem. (If you've got
kanji, you're on your own.)
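A rough Python sketch of this kind of fallback is shown below. It is an illustration only: the function name ascii_anchor is hypothetical, the exact form of the hyphenated character names is a guess, and ASCII characters (including spaces) are passed through unchanged, which a real link generator would still need to sanitize.

```python
import unicodedata

def ascii_anchor(text: str) -> str:
    """Reduce text to ASCII for use in a link name or id."""
    pieces = []
    for ch in text:
        if ch.isascii():
            pieces.append(ch)
            continue
        # Accented Latin letters and ligatures decompose to ASCII base
        # letters plus combining marks; the marks are dropped here.
        base = "".join(c for c in unicodedata.normalize("NFKD", ch) if c.isascii())
        if base:
            pieces.append(base)
        elif ch.isalnum():
            # Non-Latin letter or digit: fall back to its Unicode name,
            # with the words of the name joined by hyphens.
            pieces.append("-" + unicodedata.name(ch, "X").replace(" ", "-") + "-")
        else:
            # Anything else is represented by -X-.
            pieces.append("-X-")
    return "".join(pieces)

print(ascii_anchor("caf\u00e9"))
print(ascii_anchor("\u03b1"))
```

The output is verbose for non-Latin text, but it contains only ASCII and the same input always yields the same anchor.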
Version
.47(572k) Made Footnote Fixup automatically scroll the
footnotes to the bottom of the visible window on search. Makes for less
scrolling while checking Footnotes/Anchors during First Pass.
Added an option to "Center on Search" that will make the Footnote
search behave the way it did originally, (EG, center the Footnote in
the visible window as you cycle through them.)
Fixed problem with the page number variable passed to external commands
not passing the complete number. Was chopping off the last digit.
Ooops, my regex was a bit greedy.
Removed all instances of self closed anchors from the autogenerated
xhtml files. They were causing problems if the MIME types weren't set
correctly on the server. Also was apparently causing IE to completely
lock up upon occasion.
Kludged up CSS for centered tables and images so broken yet popular
browsers would be forced to render the centered sections correctly.
Modified Header file to avoid enabling "Quirks Mode" in Internet
Explorer. Apparently, Quirks Mode means something like "Render
this any damn way you want.... except correctly".
Version
.463(572k) Fixed problem where if the text window was
sized so that the visible window was not an exact multiple of the text
line height, and you moved the cursor down with the arrow key at the
bottom of the text window, the cursor would skip every other line.
Actually, I tracked it back to a bug in the Tk::Text module. I fixed it
locally and submitted a bug patch to the Tk maintainers.
Looked at trying to stabilize the window size between sessions under
Linux. Opens consistently the same size under Mandrake 10/KDE. Haven't
tested it with other Distros/Window managers. Not sure what else I can
do. I have a feeling this is going to be dependent on how
different window managers define the window geometry.
Version
.462(569k) Added .u class to CSS and autoconvert
<u> </u> to <span class="u"> </span> per request.
Version
.461(569k) Added a Column format modifier to the HTML
table tool so you can set the alignment of columns independently while
automatically generating table markup. There already was a set of radio
buttons where you could select Left, Center or Right aligned text for
the table. This is still there and is used as a default setting.
In general, when laying out tables, it is preferable to be able to
align different columns differently. E.G. names are typically left
aligned, but columns of numbers are often right aligned. With the original
selector, EVERY cell in EVERY column got the same alignment and you had
to do a manual alignment change for columns that you wanted to be
different. That isn't too bad if you only have one or two tables, or
even ten or fifteen. I, however, just PPed a text with 352 tables
(yes, I counted) where nearly every table required manually adjusting
alignment after generation, and I got fed up.
There is now a "Column Fmt" entry just below the Auto Table button
where you can specify column alignment. Specify the alignment
with the characters "<" for left aligned, "|" for centered text &
">" for right aligned text. Every column should have a corresponding
character telling it how to align the text. Extras are ignored. If you
have more columns than characters, the default setting on the radio
buttons is used for the excess. Every cell in the column gets the same
alignment.
Say you have an eight column table and you want the columns aligned,
going from left to right: left justified, centered, centered, right
justified, left justified, left justified, right justified, centered.
Put the string "<||><<>|" in the entry box and hit Auto
Table. You will probably still have to tweak the header cells a
bit, but in general this takes out a lot of the drudgery.
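The rules for the Column Fmt string (one character per column, extras ignored, missing columns fall back to the default) can be expressed compactly. The following is a Python sketch with hypothetical names, not the script's own code; treating an unrecognized character as the default is an assumption.

```python
ALIGNMENTS = {"<": "left", "|": "center", ">": "right"}

def column_alignments(fmt, column_count, default="left"):
    """Return one alignment per column from a Column Fmt string."""
    # Characters beyond the number of columns are ignored.
    aligns = [ALIGNMENTS.get(ch, default) for ch in fmt[:column_count]]
    # Columns without a character use the default (the radio button setting).
    aligns += [default] * (column_count - len(aligns))
    return aligns

print(column_alignments("<||><<>|", 8))
print(column_alignments("<>", 4, default="center"))
```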
Fixed minor problem where doing an explicit Block Rewrap was yielding
slightly different results than rewrapping text enclosed in block
markup.
Added an Interrupt button to the Search and Replace Replace All
function. If you had a S&R that was taking a long time there wasn't
any way to break out of it without quitting and losing all your unsaved
edits. Now there is. Similar to the Interrupt Rewrap button.
(Actually, it IS the same as the Interrupt Rewrap button. I generalized
the code to Interrupt Operation.)
Fixed small problem where if an illustration was an illuminated initial
cap letter, the paragraph markup was being mistakenly removed.
Note: they will still probably require some tweaking, but not as
much.
Fixed problem where /F F/ markup was not closing open paragraphs before
where it started.
Also, check out the new manual for guiguts that dcortesi has put
together. It is much better laid out and user friendly than the default
one, (which has become something of a glorified change log.) It doesn't
have a permanent home right now, but I have put up a redirect page to
it that I will attempt to keep current at http://mywebpages.comcast.net/thundergnat/ggmanual.html
Comments, questions & suggestions are welcome, though please direct
them to the Guiguts Online Manual thread - http://www.pgdp.net/phpBB2/viewtopic.php?t=13808
Version
.46(568k) Lots and lots of small tweaks to make interface
more consistent.
Changed Quote highlighting to be visible over the selection
highlighting.
Worked extensively on Footnote tool to make it less problematic to
convert between inline and out-of-line footnotes. Fixed a bunch of small
bugs.
Added Control+g hotkey to search for next occurrence of last search.
Works very similar to Control+f except focus stays in the text window.
Fixed problem under Linux where script was referencing a no longer
needed/supported package option. (KDE drag & drop)
Modified key insert functions to no longer insert control characters
into the text when an unbound Control+(Key) was pressed.
Fixed Row:Column readout in status bar to read out the correct number
of lines while in block select. (Was off by one.)
Suppressed bogus blank proofer from being displayed in the Display
Proofers window.
Fixed a bunch of operations that are performed behind the
scenes as a series of small steps to Undo as a single step.
Fixed HTML Link check to be a little smarter about checking for links
to local files.
Fixed Word Frequency Character Counts to search correctly for
whitespace characters. ( again... :-/ )
Fixed annoying problem where text would jump to the right if line
numbers were enabled, the text was scrolled left and you inserted the
cursor on a long line.
Worked quite a bit on the Tfx functions to be less user unfriendly.
Insert/Add line will now automatically select the Inserted/Added
line. Selecting a line is much less finicky now. Added Select
Previous Line / Select Next Line buttons. Will cycle through the lines
left and right respectively. Place the cursor anywhere between two
lines and Select Next or Previous to select the desired line. Does a
lot more "do what I mean" now. Added a column width readout. Displays
the number of spaces in the selected column. (The column to the left of
the selected line.) Makes it easier to rewrap columns to a certain
width if you can see at a glance what the width is. Bound left and
right arrow keys to select previous and
select next lines, bound Control-Left-Arrow and Control-Right-Arrow to the
"Move Line Left" and "Move Line Right" functions.
Changed how automatically generated illustration captions are handled.
Now uses a "caption" style sheet markup instead of fixed markup. Fixed
Image code insert function to work if there is no text selected.
Fixed problem where Auto generated html page title was not handling
words with apostrophes correctly.
Version
.456(564k) Changed default HTML encoding in header file.
Apparently the whitewashers would prefer not to have UTF-8
encoded files unless you really, really need them. Ah well. Wasn't
particularly necessary for the guiguts auto generated HTML anyway. I
just put it in there since it seemed like a safe default, but the law
of unintended consequences reared its ugly head. I could do encoding
detection I guess, but it would be rather pointless. The HTML files
guiguts generates are us-ascii. All the time. Every time. No matter
what characters may be in the text file.
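One way to guarantee us-ascii output whatever the input contains is to replace every non-ASCII character with a character reference. The Python sketch below uses numeric references for brevity; that choice and the function name are assumptions, and guiguts itself may emit named entities instead.

```python
def to_ascii_html(text: str) -> str:
    # Non-ASCII characters become &#NNN; references; ASCII passes through.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(to_ascii_html("caf\u00e9"))
```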
Fixed a small problem in the table Tfx function "Convert Step to Grid".
It would choke on table cells that had only one character in them.
Tweaked generated HTML a little more. Trapped a few <p> marks
that were creeping into the /X X/ marked passages.
Modified "Clean up markers" routine to remove /X .. X/ markup. Oops
Worked on Footnote functions a bit. Fixed a bug in the reindex function
that was losing track of anchors under certain conditions. Added code
in the Check footnotes function to check for footnotes that are not in
sequence. Out of sequence footnotes can cause odd errors while
reindexing inline footnotes.
Added option in the footnote functions to convert all of the footnote
markers to numbers, letters or Roman numerals. Select the appropriate
checkbox and reindex.
Fixed a few oddities in the Greek transliteration tool. Not really
bugs, just things that worked a little unexpectedly. (To me at least.)
Which is close enough to being a bug, I guess.
Version
.455(564k) Reworked end of line removal some more. The
routine I had implemented WAS faster than the previous one; it was
just deceptive because it blocked the program while it was running, so it
seemed like the program had locked up and the wait felt longer. I have
made some changes which should help it run much faster.
Worked on a few of the HTML markup generation routines (not in auto
generate) to generate correct markup.
Added /x .. x/ markup - Skip. For text version, does basically
the same as /$ .. $/ , for HTML, it does nothing; well, almost nothing.
It adds <pre> </pre> markup around the block, and named
entities, angle brackets, ampersands and quotes will still be
converted.
Added an option to the page marker adjust function where you can insert
persistent page markup into the text.
Version
.454(562k) Removed the Control-` (Column Paste) key
binding completely. Was causing more trouble than it was worth. Column
Paste is still available bound to F3 and through the menu.
Fixed typo in spell check bookmark function that was causing it not to
work. It used to work, I don't know how it got changed. Gremlins
perhaps...
Rewrote Change All function in Spell Check. I was never very happy with
it anyway. Should be able to use this now without adversely affecting
spell check.
Changed the key bindings for the spell check hot keys to Control key
combinations. The bare hot keys were having unanticipated interactions
with the term entry boxes. The hot keys are now: "Control-a" - add word
to
Aspell dictionary, "Control-p" - add word to project dictionary,
"Control-s" - skip
word and "Control-i" - skip all occurrences of word (ignore). "Return"
to accept the proposed replacement and search for the next misspelling
is unchanged.
Modified Selection popup to update to the current selection if invoked
while already open.
Modified header file a bit. Added a footnote anchor style - .fnanchor.
Tweaked the footnote markup in general. Made a few other small changes.
Version
.453(562k) Worked a great deal on converting HTML auto
generate routine to produce valid XHTML 1.0 strict. Not really
successful for complex texts. In general, it will produce valid XHTML
1.0 transitional, but (especially for texts that have page markers and
sidenotes/line numbers,) there will be some cleanup necessary of page
anchors that aren't inside block elements to make it strict.
Revamped the style
markup for footnotes and sidenotes. The footnote style I blatantly
stole from one of Jon Ingram's projects. The sidenote markup I made up
myself; it may be a bit over the top, but I like it. Image markup now uses
floats instead of the deprecated align markup. Modified image
routine to automatically insert Illustration caption as the image
caption.
Made blockquote mark up selectable between CSS markup or HTML
<blockquote> </blockquote> styles. I have switched this
back and forth several times now and no matter how I have it, I get
grumbling. Fine. Select which style you like and do it that way.
Fixed minor bug in file save routine where it appended a blank line to
the end of the file every time you saved and reopened the file.
Modified search routine to automatically vertically center the found
item in the screen, even if it doesn't technically need to scroll to
get to it. Helps in situations where you need to see what is on
the line AFTER the found term in order to determine what action to take
with it. It would often end up with the found item as the last line in
the window necessitating manual scroll to view the next line.
Added some hot keys for case modifications Ctrl+u - upper case
selection, Ctrl+l - Lower case selection and Ctrl+t - Title case
selection. I sacrificed the transpose function which used to be
attached to Ctrl+t. I don't think it was heavily used anyway. I didn't
assign Sentence case to a hot key, I am running out of keys that can be
assigned as hot keys and I doubt that one is as useful.
Modified the selection popup to be a little more useful. It will not
disappear every time you use it now and will automatically enter the
values for the current selection when it is invoked.
Worked on column paste functions to work better when trying to paste a
block at the end of a line or across a line that is shorter than the
insertion point. A drawback is it tends to leave trailing spaces
around, but it works much more intuitively now.
Added another button to the tool bar: Eol - End of line trailing space
cleanup. With the column cut & paste functions and the table tools,
you'll often end up with odd trailing spaces. This is just a quick
shortcut to the function already available in the menu. I rewrote the
function to be much (MUCH) faster too. I have gotten a lot better with
perl and Tk since I wrote the original. It will operate on a selection
if there is one or the whole file if there isn't.
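The cleanup itself amounts to removing spaces and tabs at the end of every line. A minimal Python sketch of the idea (not the script's implementation; the function name is hypothetical):

```python
import re

def strip_trailing_spaces(text: str) -> str:
    # With MULTILINE, $ matches at the end of each line, so spaces and
    # tabs are removed from every line while the newlines stay in place.
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)

print(repr(strip_trailing_spaces("a  \nb\t\n c")))
```

Doing the whole buffer in a single regex pass, rather than line by line, is one common reason such a rewrite ends up much faster.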
Modified gutcheck report window delete function to automatically search
for the next item in the list after the deleted item upon deletion.
Modified search routine to vertically center the found item in the
display window.
Modified Word Frequency Harmonics window to allow you to scroll through
the main word frequency list using the up and down arrows. If the
Harmonics window has focus, pressing either the up or down arrow will
move to the next/previous word in the main word frequency window and
run a harmonics check on it.
Added some hot keys to the spell check pop up window. "a" - add word to
Aspell dictionary, "p" - add word to project dictionary, "s" - skip
word, "i" - skip all occurrences of word (ignore) & "Return" -
accept suggested replacement (Change).
Added a Unicode character entry pop up under the Help menu. If you know
the hex or decimal ordinal of a Unicode character you can use this tool
to
insert the character at the cursor.
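Converting an ordinal to a character is a one-line operation in most languages. A Python sketch, with a hypothetical function name and the assumption that the entry is plain digits with no "U+" or "0x" prefix:

```python
def char_from_ordinal(entry: str, base: int = 16) -> str:
    """Return the character for a hex (base 16) or decimal (base 10) ordinal."""
    return chr(int(entry.strip(), base))

print(char_from_ordinal("00E9"))        # hex entry
print(char_from_ordinal("233", 10))     # decimal entry for the same character
```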
Version
.452(557k) Added some more functionality to the Table
effects function. It is set up to operate on tables that have multiple
lines per cell and a blank line between each row. Many, (most?) tables
have only one line of text in each cell and are packed with no spaces
between rows. You could still use the table tools but you needed to
manually add a blank line between each row and then remove them again after
the adjustments were done. I have added two new buttons Space Out Table
and Compress Table to automate those operations. Space Out Table will
put a blank line after each line in the table. This will allow you to
use the column adjustment tools without incorrectly rewrapping the
columns. Compress table will remove all of the blank lines between rows
again when you are done making adjustments.
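The two operations are simple inverses of each other for a packed table. A Python sketch of the idea (function names are hypothetical, and this is not the script's code):

```python
def space_out(table: str) -> str:
    """Put a blank line after each line of the table."""
    return "".join(line + "\n\n" for line in table.splitlines())

def compress(table: str) -> str:
    """Remove all blank lines from the table again."""
    return "".join(line + "\n" for line in table.splitlines() if line.strip())

packed = "a|b\nc|d\n"
print(repr(space_out(packed)))
print(repr(compress(space_out(packed))))
```

Note that compress removes every blank line, so it should only be applied to a table that had no blank lines of its own to begin with.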
I have also added an Auto Columns button. This will try to figure out
the column layout and automatically insert vertical lines between
columns. If your table cells are delimited with vertical bars
"|", it will align on the vertical bars. Otherwise, it
assumes multiple consecutive spaces delimit the columns and
will insert vertical bars and align them. If you accidentally get
extra cells, delete the vertical bar just before the cell(s) in error
and re run Auto Columns.
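A simplified Python sketch of this kind of column detection follows. It is an illustration under stated assumptions, not the script's algorithm: the function name is hypothetical, cells are left-justified, "multiple consecutive spaces" is taken to mean two or more, and the input must contain at least one line.

```python
import re

def auto_columns(lines):
    """Split rows into cells and rejoin them with aligned vertical bars."""
    if any("|" in line for line in lines):
        # Cells are already delimited with vertical bars: align on them.
        rows = [[cell.strip() for cell in line.split("|")] for line in lines]
    else:
        # Otherwise treat runs of two or more spaces as column breaks.
        rows = [re.split(r" {2,}", line.strip()) for line in lines]
    column_count = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells.
    rows = [row + [""] * (column_count - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(column_count)]
    return ["|".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
            for row in rows]

for line in auto_columns(["Name   Age", "Alexander  33"]):
    print(line)
```

A single space inside a cell (for example a two word name) does not split the cell, which is why stray extra cells usually come from accidental double spaces.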
Changed Alt selection to Shift selection. Using the Alt keys carries a
bunch of other baggage along with it that was becoming increasingly
hard to work around as I added functions to work with the block
selection.
Worked on expanding the functions that work correctly with the
Shift-select selection boxes. Now the Case modifiers, Surround
Selection, Flood Fill, Convert To/From Named Entities and
Convert Fractions work correctly with non
contiguous selection blocks.
Eliminated annoying flicker during Shift-selection drag operations.
Worked on making Shift-select Block selection work seamlessly with
cut-'n-paste. Almost, but not quite; don't know that I can make it
better
though. You can now use Cut and Copy, (Ctrl+x & Ctrl+c) with either
selection method and it will do a normal or column (block) operation
depending on the selection made. Normal cut/copy for normal selections
and column cut/column copy if there is a block selection made. It is
impossible to automatically determine if you want a normal paste or
column paste in a modeless operation, so you must specify normal paste,
(Ctrl+v) or column paste, (Ctrl-`). Note: the original hot keys
for column cut, copy and paste are still active, F1, F2 & F3
respectively. When doing column cut and paste operations, you will most
likely be happier with the results if you change from Insert mode to
Overstrike mode.
Added a Selection block vertex readout to the status bar. Shows the
start point and end point of the current selection. Possibly useful
during column cut and paste operations. Added some more functionality.
If you are doing block selections, single click on the selection box in
the status bar and it will read out the size of the selection box. (If
you are doing normal selection, it will not change as the size of the
selection box is less useful there.) If you have made a previous
selection and right click on the selection status box, the previously
selected text will be selected again. If you double left click on the
box, a selection dialog will pop up where you can enter a start and end
point for the selection and then select it.
Reduced the size of some of the items on the status bar, it was
starting to get out of hand.
Tweaked the default English common highlighting word list. Added some
more words.
Fixed problem with highlighting function where the first time you
invoked it, you could not cancel out of choosing a word list.
Fixed minor problem in Word Frequency, character counts, where you
couldn't search for zero. Thought I fixed this before; I must have
broken it again at some point.
Started working on undo functions to make actions that are performed
with a single operation be undone with a single undo. I had tried to do
this before and mostly succeeded in making the undo functions
almost useless. I think I have figured out the proper methods to
manipulate the undo buffer without corrupting it now. There are some
operations that, while they can
be undone, probably should not
be undone. Namely, rewrap operations. At this point, if you undo a
rewrap operation, any page markers that are in the middle of a
paragraph will be shifted to the end of the paragraph.
I am looking into ways around this but it may take me a while
to figure out a solution. For now, avoid undoing rewrap if at all
possible. If you rewrap to wrong margins, adjust your margins and
rewrap again. If you accidentally rewrap a table or index or some
such, you may be forced to undo, but you'll need to manually reset the
page markers where they belong, (assuming they were in the middle of
paragraphs.)
Worked on the link checker yet some more. Now it is nearly
bulletproof. The only fragile part is that if you get the directory
wrong in the first image link, all of your images will be reported as
not found.
Fixed spelling error in Greek transliteration tool; lamda -> lambda
Version
.451(550k) Fixed stupid error in Greek transliteration
tool where the ou ligature was mistakenly listed as an au ligature and
inserted au. I have no excuse for that. That is just embarrassing.
Fixed problem with inserting Unicode encoded files into an open file
not decoding the first line properly. (Or inserting it in the proper
place either...)
Reworked link checking code quite a bit. Still looks essentially the
same to the user, but the underlying parsing code is much more tolerant
of variation in the layout of the anchor. Incidentally, made checking
code much more efficient too, though that's not a real big win here.
Running in .1
seconds instead of .25 seconds doesn't mean a lot in the grand scheme
of things if said function only runs once or twice a session.
Tweaked external link warning code to differentiate between externally
linked local files and remote links. Will now check to see if
externally linked local files exist and warn if they can not be found.
(Will only check files located in or below the directory the main file
is in. Files in other paths are assumed to be an error and are not
checked for existence.)
Added new function: Automatic word highlighting. Primarily geared
toward automatic stealth scanno highlighting, it can be easily
customized to highlight any word list you choose. Word lists
must be a plain text file with one word per line, and nothing else. A
default word list is included with the script in the new
guiguts/wordlist directory. It is essentially the
English common scannos list reformatted with one word per line and all
of the extremely common variations removed (and a few other words
added). Left click on the little H
in the status bar. The first time it runs, it will ask which word list
to use. Browse to it and open the desired word list. The H status box
will change to the highlight color and any of the words from the word
list file located in the text will be highlighted. The highlight
updating occurs every 3/4 to 1 second, so if the status bar H is
highlighted, but there are no words highlighted in the text, scroll
around a bit. Left click on the H again to disable highlighting. The
box will return to a gray background. Once you select a word list, it
will continue to be used until you change it or restart guiguts. To
change the word list, Right click on the H in the status bar. The same
dialog will pop up asking which file to use. If you don't care for the
default highlight color, you can change it under the prefs menu. The
highlight color will be retained from session to session. Words in the
word list can not contain any punctuation except apostrophe. They can,
however, contain any legal Unicode alpha-numeric character below ordinal
FE00. Words are case sensitive.
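The matching rules above can be sketched in a few lines. This is a Python illustration only, not the Perl code guiguts actually uses; the names load_word_list and find_highlights are made up for the example, and the token pattern is just an approximation of "alpha-numeric characters plus apostrophe" (it does not enforce the FE00 limit).

```python
import re

# Approximation of a "word": letters/digits, optionally joined by apostrophes.
TOKEN = re.compile(r"[^\W_]+(?:'[^\W_]+)*")

def load_word_list(text):
    """Parse word-list file contents: one word per line, nothing else."""
    return {line.strip() for line in text.splitlines() if line.strip()}

def find_highlights(text, words):
    """Return (start, end) offsets of every token found in the word list.
    Matching is case sensitive, as described above."""
    return [(m.start(), m.end()) for m in TOKEN.finditer(text)
            if m.group() in words]
```

Because membership is tested against a set, the cost per word on screen does not grow with the size of the list, which is consistent with a fairly large list carrying little penalty.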
Note: I am open to suggestions for words to be included in the default
English list. (Though I may choose to ignore them.) Due to the way I am
processing the list, there is little penalty for having a fairly large
one, though more than a thousand words would get unwieldy. Too many
words would reduce the effectiveness too by overloading the screen with
highlighted words.
An interesting possible use for the word highlighting function. Open a
project and go to the Word Frequency function. Do a Spell check. Press
Control+x to export the word list from the word frequency window. You
can now open the wordlist.txt file in the highlighting function to do
automatic highlighting of words not recognized by the spell
checker while you scroll through the text!
Version
.45(546k) Twiddled around with HTML auto generate code on
non rewrap marked sections to try to get rid of italics markup it
persisted in inserting in error. Think I'm finally getting close.
Added a "Flood Fill" tool, sort of based on vitalogy's suggestion.
Select a section of text and fill it with a specific character or
string. (By default, a space.) There is a pop up window under the selection
menu where you can edit the fill string and activate the function. Also
available directly as a hot key; Control+w. (It's not really mnemonic
for anything, I just prefer hot keys on the left side of the keyboard
and I'm running out of available keys.)
Fixed incorrect label on Gutcheck -t option. It was improperly
labeled as the opposite of what it actually is. Oops. I never noticed
because I always run in paranoid mode which automatically enables all
checks.
Added a "book mark" function to the spell check. If you are partway
through spell checking a document, you can set a "book mark" which will
allow you to come back to the same spot later and pick up where you
left off. (Words which have been "Skip All"ed will continue to be
skipped.) You don't NEED to start where you left off, by default it
will start over at the beginning each time you restart. You can choose
to set a book mark and return to it later. You can even set a book
mark, then restart spell checking at the beginning and then start again
at the book mark. Once you set a "book mark", it will remain until you
set another.
Fixed a small problem with the tidy error check where it would have
problems with going to the errors if you ran it several times in a row.
Putzed around with HTML auto generate, making the code it generates a
little more XHTML compliant. Not completely there, but it's getting
there.
Added an alternate mouse selection method. If you hold down the Alt key
while selecting with the mouse, the selection will be confined to
the rectangle defined by the anchor point vertex and the current
pointer position. This isn't MUCH use yet, as all of the
selection mode functions need to be converted to be able to use non
contiguous selection segments. Right now though, if you have a block
selected, and you hold down the Alt key and press one of the arrow
keys, the block will move in that direction. (Note: don't try to move a
selected block up or down through a line that isn't filled past
the block. It won't do what you want.) Right now, this is best used
with the table tool described below to adjust the vertical alignment of
header cells in a table after it has been rewrapped.
Added a whole new tool section for ASCII table munging.
When post processing table heavy texts, it is very common to get tables
which have been bizarrely and inconsistently formatted by the proofers.
Especially when you get multi page tables in which consecutive pages
were done by different proofers. You could literally spend hours
carefully and laboriously reformatting them, trying to fit them into
the 75 space maximum allowed by PG (80 space absolute max). It was
enough to make many people avoid texts with lots of tables like poison.
Well, help has arrived.
Guiguts now has a special purpose ASCII Table Special Effects
tool (Tfx in the tool bar). If you have a table that has vertical
bar separators,
you can now move the separators left and right, automatically
rewrapping and maintaining cell layout in the columns. Your table
doesn't have vertical bar separators? Not to worry. There are tools to
easily automatically add, remove and relocate them. Your table is too
wide to fit within 75 spaces in a grid layout no matter how narrow you
make your columns? Automatically reformat your grid format table
as a stepped column table, which will typically allow at least 3 times
the nominal width and still fit in 75 spaces. Don't like your stepped
format table? Automatically reformat it as a grid layout table with 1
button press. To adjust the spacing of columns in a grid
layout table, there needs to be a bar on both sides of the column being
adjusted. Select the bar to the RIGHT of the column. (To select a bar,
highlight any ONE segment of it and press Vertical Line Select.) Select
whether to automatically rewrap the column as you move the bar and
whether to left, center or right justify the column. Move the bar left
or right and the text in the columns will automatically adjust to
follow the bar, depending on your settings.
The toolset doesn't do absolutely everything, but it probably takes
about 85-90% of the effort out of table reformatting.
Say you have a table that looks like this: (An actual table from a book
I am working on)
|              |             |         |AMOUNT OF|            |
|              |             |         |  WATER  | AMOUNT OF  |
|              |             |         | NEEDED  |   SUGAR    |
|              |CHARACTER OF | HOW TO  |   FOR   | NEEDED FOR |
|KIND OF FRUIT |    FRUIT    | PREPARE | COOKING |  JELLYING  |
|              |             |         |         |            |
|APPLES, SOUR  |Excellent for|Wash,    |Include  |¾ cupful of |
|              |jelly making |discard  |One-half |sugar to 1  |
|              |             |any      |as much  |cupful of   |
|              |             |unsound  |water as |juice       |
|              |             |portions,|fruit    |            |
|              |             |cut into |         |            |
|              |             |small    |         |            |
|              |             |pieces.  |         |            |
|              |             |         |         |            |
|APRICOTS      |Not suitable |Leave a  |For jam  |¾ cupful of |
|              |for jelly    |few      |use just |sugar to 1  |
|              |making.      |stones in|enough   |cupful of   |
|              |Excellent for|for      |water to |apricots    |
|              |jam.         |flavor.  |keep from|for jam     |
|              |             |         |burning  |            |
|              |             |         |         |            |
|BLACKBERRIES  |Excellent for|Wash     |1 cupful |¾ cupful of |
|              |jelly making |         |of water |sugar to 1  |
|              |             |         |to 5     |cupful of   |
|              |             |         |quarts of|juice       |
|              |             |         |berries  |            |
|              |             |         |         |            |
|BLUEBERRIES   |Excellent for|Wash     |1 cupful |1 cupful of |
|              |jelly making;|         |of water |sugar to 1  |
|              |make a sweet |         |to 5     |cupful of   |
|              |jelly        |         |quarts of|juice       |
|              |             |         |berries  |            |
|              |             |         |         |            |
|CRANBERRIES   |Excellent for|Wash     |One-half |¾ cupful of |
|              |jelly making |         |as much  |sugar to 1  |
|              |             |         |water as |cupful of   |
|              |             |         |berries  |juice       |
|              |             |         |         |            |
|CHERRIES      |Pectin must  |Pit the  |For jam, |¾ cupful of |
|              |be added for |cherries |use just |sugar to 1  |
|              |jelly making |for jam  |enough   |cupful of   |
|              |             |         |water to |cherries for|
|              |             |         |keep from|jam         |
|              |             |         |burning  |            |
|              |             |         |         |            |
|CRAB APPLES   |Excellent for|Same as  |One-half |¾ cupful of |
|              |jelly making |apples   |as much  |sugar to 1  |
|              |             |         |water as |cupful of   |
|              |             |         |apples   |juice       |
With a few button presses you can make it look like this: (adjustable
wrap margin; set to 50 here.)
KIND OF FRUIT
|CHARACTER OF FRUIT
|  |HOW TO PREPARE
|  |  |AMOUNT OF WATER NEEDED FOR COOKING
|  |  |  |AMOUNT OF SUGAR NEEDED FOR
|  |  |  |JELLYING
|  |  |  |
APPLES, SOUR
|Excellent for jelly making
|  |Wash, discard any unsound portions, cut
|  |into small pieces.
|  |  |Include One-half as much water as
|  |  |fruit
|  |  |  |¾ cupful of sugar to 1 cupful
|  |  |  |of juice
|  |  |  |
APRICOTS
|Not suitable for jelly making. Excellent for
|jam.
|  |Leave a few stones in for flavor.
|  |  |For jam use just enough water to
|  |  |keep from burning
|  |  |  |¾ cupful of sugar to 1 cupful
|  |  |  |of apricots for jam
|  |  |  |
BLACKBERRIES
|Excellent for jelly making
|  |Wash
|  |  |1 cupful of water to 5 quarts of
|  |  |berries
|  |  |  |¾ cupful of sugar to 1 cupful
|  |  |  |of juice
|  |  |  |
BLUEBERRIES
|Excellent for jelly making; make a sweet jelly
|  |Wash
|  |  |1 cupful of water to 5 quarts of
|  |  |berries
|  |  |  |1 cupful of sugar to 1 cupful
|  |  |  |of juice
|  |  |  |
CRANBERRIES
|Excellent for jelly making
|  |Wash
|  |  |One-half as much water as berries
|  |  |  |¾ cupful of sugar to 1 cupful
|  |  |  |of juice
|  |  |  |
CHERRIES
|Pectin must be added for jelly making
|  |Pit the cherries for jam
|  |  |For jam, use just enough water to
|  |  |keep from burning
|  |  |  |¾ cupful of sugar to 1 cupful
|  |  |  |of cherries for jam
|  |  |  |
CRAB APPLES
|Excellent for jelly making
|  |Same as apples
|  |  |One-half as much water as apples
|  |  |  |¾ cupful of sugar to 1 cupful
|  |  |  |of juice
And with a few more button presses, you can change it back.
Now, for this table the step format isn't really necessary, but if your
cells contain a LOT of text, the step format can be MUCH more efficient.
There are some things it doesn't deal with well. Cells that span
multiple columns may (probably will) be problematic. You may need to
reformat tables that have them, in sections. You will probably need to
clean up some extra cell division bars at the bottom of converted
tables. No big deal, just select and delete. If your table doesn't have
a right edge bar, the routines may leave some spaces at the end of the
line; again, there are already methods available to fix that.
I am quite pleased with the table tools. The table conversion tools,
though impressive, were relatively easy to program. That was mostly just
a matter of reading the text into a matrix, doing a transform and
rewrap and writing it back out again. The grid column adjusting tool
was surprisingly complex to write. Every time the column moves a space,
it needs to do four table matrix transforms and a rewrap operation on
each cell in the column while keeping track of individual cell heights
in relation to other cells on the same row and adjusting so they don't
collide when being written back out.
Version
.446(537k) And arggh again. Should have let .445 bake a
bit longer.
Fixed problem with bin file saving code where it was adding an
extraneous ');' and causing the file to be unparsable by guiguts.
This would cause you to lose your page markers if you closed the file
and then reopened it. Not good.
Fixed problems with Auto Save and Auto Backup functions. Now they can
be enabled and disabled properly and settings are retained session to
session.
Added a custom paste routine. I call it "Overpaste". If your
Insert/Overstrike mode is in Overstrike, the paste function will
overwrite text as you paste it in. I found this very handy when I
was reformatting tables in ASCII. Using this function, I could set up
an empty framework, and then cut and paste the text into the frame
without disturbing the layout. (The text I was pasting in would
overwrite the spaces.) I cut the time I was spending on reformatting
tables by probably at least 30-40% since I didn't need to
constantly delete spaces after I pasted in text. The function is
intelligent enough to check if it is near the end of a line and not
overwrite the newline and start of the next line if the pasted text is
longer than the text between the insertion point and the end of the
line. In that case, it will overwrite characters until the end of the
line and then just append the remaining text.
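The end-of-line behaviour described above is easy to show on a single line of text. A minimal Python sketch follows; the function name overpaste and the single-line restriction are assumptions made for the example, and it is not the routine guiguts itself uses.

```python
def overpaste(line, col, text):
    """Overwrite characters of `line` starting at column `col` with `text`.

    `line` is passed without its trailing newline, so the newline and the
    next line are never touched. If `text` is longer than what remains of
    the line, the slice after it is empty and the excess is simply appended.
    """
    return line[:col] + text + line[col + len(text):]
```

Pasting "abc" into an empty table cell with overpaste("|     |     |", 1, "abc") leaves the bars where they were.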
Fixed a few more problems with HTML auto generate with italics markup
in poetry sometimes going awry.
Fixed incorrect title text in Browser Start Command dialog window.
Version
.445(531k) Arrgghhh! It helps if I actually decode the
correct variable. Fixed problem where first line of text in a UTF-8
encoded file was not being decoded properly if there were any multi
byte characters on the line.
Added Auto save function. Toggle it on and off under Prefs menu. If
enabled, will automatically save the file at a set interval. Also added
Auto save Interval Adjust function under Prefs menu; pick how many
minutes between automatic saves if auto save is enabled. Set to 5
minutes by default, can be any integer from 1-999. If set to zero,
will revert to default (5 minutes).
Now will check for and warn when you attempt to Save As to a file
name that will have a collision with its .bin file. In other words, if
you have a file myfile.txt and want to generate an HTML version, you
will typically save a copy and work on that. If you save it
as myfile.html, there will be a collision of .bin files, since it just
takes the part before the extension and changes to a .bin extension to
name the bin. Will now warn you if you attempt to do this.
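The collision rule can be sketched like this. It is a Python illustration with made-up function names (bin_name, bin_collisions), assuming only what is stated above: the .bin name is the part before the extension plus ".bin".

```python
import os

def bin_name(path):
    """Companion .bin file name: everything before the extension, plus '.bin'."""
    return os.path.splitext(path)[0] + ".bin"

def bin_collisions(new_path, existing_paths):
    """Return the existing files that would share a .bin file with new_path."""
    target = bin_name(new_path)
    return [p for p in existing_paths
            if p != new_path and bin_name(p) == target]
```

Saving the HTML copy under a different base name (for example myfile-h.html, a name chosen only for illustration) avoids the collision.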
Fixed problem with Goto Page and Goto Line dialogs where they would
occasionally refuse to close.
Added Auto Backups function. If enabled, will automatically save
copies of the two most recent editions of the file. Assuming a file
myfile.txt. When you save the file, if it is already saved to disk, the
copy on disk will be renamed to myfile.txt.bk1 and the current file
saved to myfile.txt. If you save it again, myfile.txt.bk1 will be
renamed myfile.txt.bk2, myfile.txt will be renamed to myfile.txt.bk1,
and the current file will replace myfile.txt. Any further saves will
result in myfile.txt.bk2 being deleted, and the same shuffle as
previously described for the remaining files. This function may be
enabled or disabled under the Prefs menu.
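The rename shuffle described above looks roughly like this in code. This is a Python sketch, not the guiguts implementation; save_with_backups is a hypothetical name.

```python
import os

def save_with_backups(path, text):
    """Save `text` to `path`, keeping the two previous versions on disk
    as path.bk1 (most recent) and path.bk2 (the one before that)."""
    bk1, bk2 = path + ".bk1", path + ".bk2"
    if os.path.exists(path):
        if os.path.exists(bk1):
            # os.replace overwrites the target, which discards the old .bk2.
            os.replace(bk1, bk2)
        os.replace(path, bk1)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
```

After four saves, the file holds the fourth version, .bk1 the third and .bk2 the second; the first is gone.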
Version
.444(531k) Applied fix I had made to /P P/ poetry italics
markup detection, to the /* */ block markup detection as well. It had
the same problem of getting confused by markup that was inside the line.
Modified guiguts to add the BOM \x{FEFF} marker at the beginning
of files that are saved in a multi-byte format. Should have been doing
that from the beginning. :-( It is astonishing how much I don't
know about Unicode files and formats. Especially that I'm allegedly
developing a Unicode text editor. Should be more compatible now with
other text editors. (Theoretically.) Thanks to garweyne for pointing
this out.
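For anyone curious what the marker looks like on disk, here is a small Python sketch (not guiguts code; the function names are made up). It assumes UTF-8 output, where the BOM U+FEFF is stored as the three bytes EF BB BF.

```python
def save_utf8_with_bom(path, text):
    # The "utf-8-sig" codec writes the BOM before the text itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        handle.write(text)

def load_utf8(path):
    # On read, "utf-8-sig" strips a leading BOM if one is present.
    with open(path, "r", encoding="utf-8-sig", newline="") as handle:
        return handle.read()
```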
Fixed badly implemented bug fix to convert fractions routine. Now
actually works again.
Fixed problem with setting file code where it would choke on file names
with an apostrophe in the path/name. Would cause settings to revert to
defaults for no apparent reason. Apostrophe is a legal character
in file names, but was causing the settings file loading code to
get confused. Rather tortuous and twisted method for fixing that, but
it works at least. (On my system; I'm sure some other bug will
manifest once it is in the wild. :-/)
Modified the auto table defaults slightly. No longer has borders set by
default. I found that 95% of the time I was setting them to '0'
anyway. Increased cell padding a bit too. Another setting I was
constantly modifying. Just because I like it better that way.
Added another custom assertion to the regex replacement parser. \C..\E
- C is a mnemonic for "code". It will parse everything between the \C
and \E as perl code, execute it and return whatever the results of the
execution were. I found this useful when I needed to add an offset to a
series of integers. When I autogenerated my HTML for a file, I had my
page markers off by two, they were in the correct positions, just
numbered incorrectly. I had done quite a bit of other tweaking to the
file before I noticed, so was loath to abandon the work I had already
done and re generate the file. Using this regex assertion, I was able
to search for the page anchors and increment them by 2 in about 5
seconds, (Well... not counting the 30 minutes it took me to write and
debug the code assertion...).
Search for:
<a name='Page_(\d+)
replace with:
<a name='Page_\C$1+2\E
Very useful. You could also use this for more complex calculations too.
Say you wanted to find all real numbers in a file (positive or negative,
integer or floating point) and find the natural logarithm of the square
of the number to 4 decimal places. (Why? I don't know, it's a
hypothetical scenario, bear with me.)
Search for:
(?<![\.\d])([-\.]{0,2}\d(\d*)?\.?\d*)     <-- Find a real number
Replace with:
Natural log of $1 squared is \C sprintf("%.4f", log($1*$1)) \E     <-- Replace with calculated value
So 255 would return "Natural log of 255 squared is 11.0825"
and -2.56 would return "Natural log of -2.56 squared is 1.8800." and so
on.
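The two results quoted above can be checked outside guiguts. The sketch below is Python rather than Perl and uses a callback instead of the \C..\E assertion; Python's re module happens to accept the same search pattern unchanged. The names REAL_NUMBER and annotate are made up for the example.

```python
import math
import re

REAL_NUMBER = re.compile(r"(?<![\.\d])([-\.]{0,2}\d(\d*)?\.?\d*)")

def log_of_square(match):
    # float() will reject odd matches such as "..5", and log(0) raises
    # ValueError; neither case is handled in this sketch.
    value = float(match.group(1))
    return "Natural log of %s squared is %.4f" % (
        match.group(1), math.log(value * value))

def annotate(text):
    """Replace every real number in `text` with the sentence above."""
    return REAL_NUMBER.sub(log_of_square, text)
```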
Note: If the operation yields an error or an undefined value, you will
get an error message in the console window and a blank return value for
the calculation.
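To make the mechanics concrete, here is a rough Python sketch of how a replacement term containing \C..\E could be expanded. It is not the guiguts code: guiguts hands the text to perl, while this sketch deliberately swaps in a small arithmetic-only evaluator so that it cannot run arbitrary code. All function names are hypothetical.

```python
import ast
import operator
import re

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def _evaluate(node):
    """Evaluate plain arithmetic only; anything else raises ValueError."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")

def expand(template, match):
    """Fill in $1, $2, ... and then replace each C..E span with its value."""
    filled = re.sub(r"\$(\d+)",
                    lambda m: match.group(int(m.group(1))) or "", template)

    def run(code):
        try:
            tree = ast.parse(code.group(1).strip(), mode="eval")
            return str(_evaluate(tree))
        except (ValueError, SyntaxError, ZeroDivisionError):
            return ""  # blank result on error, as the note above describes

    return re.sub(r"\\C(.*?)\\E", run, filled, flags=re.S)

def replace_all(pattern, template, text):
    """Search-and-replace where the replacement may contain C..E spans."""
    return re.sub(pattern, lambda m: expand(template, m), text)
```

With the page anchor example, replace_all(r"<a name='Page_(\d+)", r"<a name='Page_\C$1+2\E", text) bumps every page number by 2.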
This assertion should be used cautiously. It will execute nearly any
valid perl code it is fed. If you set it with code to delete all the
files on your hard drive and run it, it will cheerfully try to do so.
As a bit of a safety, if you try to run regex code assertions, it will
pop up a warning message asking if you really want to do so. If you do
not know what the code does and/or do not trust the source of the regex
term, click Cancel to escape without executing. Click OK to continue.
You can also select Warnings Off if you like. It is sort of an "Expert
Mode" and assumes you know what you are trying to do and take
responsibility for the consequences.
Don't panic and be afraid to use this assertion; if you don't tell it
to delete files, it is not going to. I just want to impress upon you
that if you DO tell it to delete files, it WILL. Use judgment
when executing arbitrary code. (And no, I'm NOT going to give you
an example of code to delete files, if you really want to know, read up
about perl and figure it out yourself, I am NOT going to help you shoot
yourself in the foot.)
Added a pop up Regex Quick Reference under the help menu. Actually
loads a text file in the guiguts directory named regref.txt.
Contains a summary of the assertions that are legal for use in the
regex engine as implemented in guiguts. You can edit the file if you
want to. It can contain anything you want. A likely use would be
a scratch pad to hold useful regexes that don't really belong in the
scanno regex file. (Like the above custom regex to find a
real number, for instance.)
Changed Goto Line Number and Goto Page Number dialogs to highlight
the value in the entry box when it pops up so that if you type
something, it automatically deletes the value already there.
Simple to do in the Goto Page dialog; required overriding default
methods and rewriting of code for the Goto Line dialog. A bit of a pain
for a minor gain, but the little things like that can be a major
irritation after a while.
Version
.443(524k) Fixed bug where margin prefs were being
modified unintentionally by some program operations.
Fixed bug in See Page code where if you had more than one page marker
adjacent to each other, the See page would display the image associated
with the FIRST marker in the series not the LAST. (Typically only
happened if you had a blank page in the text.)
Fixed oddity in HTML auto generate code where adjacent page markers
would be inserted in random order.
Fixed bug in Internal Link pop up window where See Page Anchors check
box was not retaining state.
Fixed potential problem in Convert Fractions routine where it would
mistakenly partially convert some fractions. I.E. 1/25 would mistakenly
be converted to ½5 (one-half 5). Fixed now.
Fixed a problem where some arrays were not getting cleared
when you loaded a project. If you were working on a project that had
footnotes and autogenerated HTML; then opened another different project
with MORE footnotes, without restarting guiguts, and autogenerated HTML
on the second project, it would freeze while generating the HTML.
Worked on trying to make Gutcheck view options sticky from run to
run. They are (and were) if you don't close the gutcheck window
between runs. Made some modifications that will improve the situation
but am not easily able to completely overcome the problem without major
redesign of the view option code. (And I don't think it is serious
enough a problem to invest major amount of time in.)
Worked on HTML autogenerate code for poetry. The way guiguts handles
poetry is to enclose each line in a <span></span> block,
however, if there are italicised portions of a poem that span lines, it
was occasionally having trouble. Italics markup is not allowed by the
HTML spec to be improperly nested with block elements, so something
like this is illegal:
<span><i>A line of poetry that is italicized</span>
<span>and here is another.</i></span>
The <i> markup can't be continued across a <span>. To
compensate for this, guiguts was detecting whether there was
unclosed <i> markup on a line and then inserting open and
closing markup on each line until it got to a line that had an unopened
</i>. That all worked, but it would get confused if there
was other separate italics markup on either the starting or
ending line.
I've reworked the markup detection logic and made it a lot more
resistant to getting confused by those types of situation.
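One simple way to get that behaviour is to count opening and closing tags per line rather than look for a single unclosed one. The Python sketch below only illustrates the idea; it is not the guiguts logic, and it assumes plain lowercase <i> and </i> tags. The name wrap_poetry is made up.

```python
def wrap_poetry(lines):
    """Wrap each line in <span></span>, closing and reopening italics
    so that <i> never crosses a span boundary."""
    wrapped = []
    carry = False  # True while an <i> from an earlier line is still open
    for line in lines:
        body = ("<i>" if carry else "") + line
        # More openers than closers means italics run on past this line.
        carry = body.count("<i>") > body.count("</i>")
        if carry:
            body += "</i>"
        wrapped.append("<span>" + body + "</span>")
    return wrapped
```

Because it counts tags, a separate, complete <i>..</i> pair earlier on the starting line does not hide the unclosed one that follows it.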
Version
.442(523k) Oops, left some debugging code active in the
HTML autoconvert routine. Wasn't hurting anything, but it slowed down
the routine by a significant amount.
Worked on link checker some more. Now only reports image links if there
is a problem. Now checks that all the files in the image directory are
used and reports any unused files. (Note: this will give some false
positives if the images are not in their own directory. It is highly
recommended to use an "images" directory if your file contains any
images, even if it is just one or two.)
Modified built in link checker to work correctly with "tidied" files.
It was fragile in that it expected to find each link or anchor completely
on one line. Tidy often breaks long links and anchors across lines.
This would lead to spurious errors being reported by the link checker.
Fixed now.
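The general idea can be sketched as follows: flatten the whitespace first so a tag broken across lines reads as one piece, then collect anchors and internal links. This is a Python illustration with made-up names and simplified patterns, not the parsing code in guiguts.

```python
import re

ANCHOR = re.compile(r"""<a\s[^>]*?\b(?:name|id)\s*=\s*['"]([^'"]+)['"]""", re.I)
LINK = re.compile(r"""<a\s[^>]*?\bhref\s*=\s*['"]#([^'"]+)['"]""", re.I)

def links_without_anchor(html):
    """Return internal link targets that have no matching anchor."""
    flat = re.sub(r"\s+", " ", html)  # undo line breaks inside tags
    anchors = set(ANCHOR.findall(flat))
    return sorted({target for target in LINK.findall(flat)
                   if target not in anchors})
```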
Figured out how to stop tidy from echoing to the console, just need to
redirect STDOUT to null. I didn't think of that before because I need
to capture the output to get the error list. I didn't realize at first
that the error list is sent to STDERR, not STDOUT so redirecting
STDOUT works fine. Sometimes it really does pay to read the
documentation... :-/ Tidy error check runs much faster now with the
echo suppressed.
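The same trick, shown as a small Python sketch rather than the Perl used in guiguts: throw away standard output and capture only standard error. The function name run_checker is made up, and the commented call assumes a tidy executable on the PATH.

```python
import subprocess

def run_checker(command):
    """Run `command`, discard its standard output and return its
    standard error as a list of lines."""
    finished = subprocess.run(command, stdout=subprocess.DEVNULL,
                              stderr=subprocess.PIPE, text=True)
    return finished.stderr.splitlines()

# Hypothetical call (file name is a placeholder):
# report = run_checker(["tidy", "-e", "-q", "mybook.html"])
```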
Found and fixed a long-standing bug in the HTML page anchor generation
code where it would sometimes inexplicably skip adding some
anchors if you had a page offset set.
Added another option to the HTML autogenerate window:
Convert Fractions. If selected, will automatically convert all
written out fractions in the text to Named/Numeric
entities while autogenerating HTML. I.E. 1-1/2 will become 1½, 5/8
will become ⅝ and so on, for all of the available Latin 1 and
Unicode fractions. (halves, quarters, fourths, fifths, sixths and
eighths.) Same function is available under the Selection menu in the
main window. If you select some text and press Convert Fractions, all
of the fractions inside the selection will be converted. If you
DON'T make a selection it will work on the whole document.
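A rough Python sketch of the conversion, including a guard so that something like 1/25 is left alone instead of being half converted. This is an illustration, not the guiguts routine: it emits the fraction characters directly rather than named/numeric entities, and the table and names are made up for the example.

```python
import re

FRACTIONS = {
    "1/2": "½", "1/4": "¼", "3/4": "¾",
    "1/5": "⅕", "2/5": "⅖", "3/5": "⅗", "4/5": "⅘",
    "1/6": "⅙", "5/6": "⅚",
    "1/8": "⅛", "3/8": "⅜", "5/8": "⅝", "7/8": "⅞",
}

# An optional whole number and hyphen, then digit/digit, with no further
# digit or slash touching either side (so 1/25 and 1/2/05 are skipped).
PATTERN = re.compile(r"(?<![\d/])(?:(\d+)-)?(\d/\d)(?![\d/])")

def convert_fractions(text):
    def swap(match):
        glyph = FRACTIONS.get(match.group(2))
        if glyph is None:
            return match.group(0)  # not a fraction we know; leave as is
        return (match.group(1) or "") + glyph
    return PATTERN.sub(swap, text)
```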
Version
.441(521k) Modified search window to avoid weird growing
problem under Linux.
Modified HTML header file load function to convert to native line
breaks on load. Should alleviate troubles under Linux while
generating HTML version.
Fixed yet another buglet in overridden margin block wrapping code.
Tweaked Auto List code a bit to generate more useful code that should
require less hand tuning afterwards. Fixed up unclosed paragraph markup
for the previous line.
Added "t" hot key to the page separator fixup function, leave two blank
lines where separator was. Similar to "l" for one blank line or
"h" for four. Very useful while PPing cookbooks with 2 blank
lines between recipes. (Wonder what I've been post processing? ;-) )
Modified named anchor routine to handle named entities &amp;
&quot; and &mdash; more gracefully.
Modified "make anchor" replacement assertion \A..\E to not
automatically add the original text back in. It is now necessary to
explicitly add it back in. That allows adding tags around the displayed
text in a single search and replace operation. For example: Say
you want to find all occurrences of "CHAPTER" followed by a roman
numeral, make an anchor and make the displayed text bold. With
the old method, there was no way to do it in one step. With the
modified assertion, you can do a regex search for:
(CHAPTER )\s*([IVXLC]+)
and replace with
\A$1$2\E<b>$1$2</b>
For "CHAPTER XVI" you would get:
<a name='CHAPTER_XVI'></a><b>CHAPTER XVI</b>
This example is rather trivial, but the \A..\E assertion will correctly
deal with punctuation, named entities, spaces and such automatically.
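As an illustration of what the assertion does with the example above, here is a Python sketch. The clean-up rules in anchor_name (collapse whitespace to underscores, drop other punctuation) are assumptions made for the example, not the actual rules guiguts applies, and the function names are made up.

```python
import re

def anchor_name(text):
    # Assumed rules, not taken from guiguts: trim, turn runs of whitespace
    # into "_", then drop anything that is not a letter, digit or "_".
    name = re.sub(r"\s+", "_", text.strip())
    return re.sub(r"[^\w]", "", name)

def expand_anchor(template, match):
    """Fill in $1, $2, ... and turn each A..E span into an empty anchor."""
    filled = re.sub(r"\$(\d+)",
                    lambda m: match.group(int(m.group(1))), template)
    return re.sub(r"\\A(.*?)\\E",
                  lambda m: "<a name='%s'></a>" % anchor_name(m.group(1)),
                  filled)
```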
Modified built in link checker to display all of the critical things
first; (internal links without anchor, external links, links with
illegal characters) before displaying informational stuff. (anchors
without a link.)
HTML Link checker now does more comprehensive image link verification.
Specifically warns if an image link name contains upper case
characters. (Not allowed under the PG spec.) Specifically warns
if an image file can not be found.
Version
.44(519k) Modified search routine to work correctly
under Linux when searching across line breaks. Modified routine to be
more robust and work seamlessly across platforms.
Fixed search and replace window to not delete text after the cursor in
the search term or replacement term entry boxes if you use the enter
key to do searches.
Modified search and replace entry boxes to be adjustable width
instead of just 40 characters. Box is resizable in the X direction if
necessary. Very handy when you are working with very long regex
expressions. Quite pleased with getting this to work. Trickier to pull
off than it appears.
Modified the various list windows (word frequency, gutcheck, link
check, etc.) to standardize on double left click as the primary
function, (search) and right click as the secondary function, (varies
by window). I can't use single left click because single left
click is already used for select. It will be a little confusing for a
while, but in the long run will be more user friendly.
Modified setting file loading code. Before, if it encountered an
error or something it couldn't parse while loading the settings, it
would revert to defaults and then overwrite the file containing
the error with the default settings. Made it very difficult to
troubleshoot when someone had a problem with the setting file. Happens
rarely, but it happens. Now, if it has a problem, it will write out the
original setting file to a file called setting.err before it overwrites
with defaults. That way you (I) can at least try to figure out
what the problem was and probably recover the settings.
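The recovery behaviour amounts to "copy the bad file aside before overwriting it". A Python sketch of that idea follows. The JSON format, the DEFAULTS values and the function name are stand-ins for illustration; the real settings file is not JSON.

```python
import json
import os
import shutil

DEFAULTS = {"lmargin": 1, "rmargin": 72}  # placeholder values only

def load_settings(path):
    """Load settings; on a parse failure keep a copy as setting.err,
    then fall back to (and write out) the defaults."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        pass  # nothing to preserve
    except (ValueError, OSError):
        # Keep the unreadable file for troubleshooting before overwriting.
        err_path = os.path.join(os.path.dirname(path), "setting.err")
        shutil.copyfile(path, err_path)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(DEFAULTS, handle)
    return dict(DEFAULTS)
```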
Made some additions to the HTML image window. Now allows you to
maintain the aspect ratio automatically while changing the display size
of images. Also displays actual size of image for reference. Slightly
buggy in that if you highlight and delete either the width or height
value, it will not start to calculate the other until you have at least
two digits in the box you are modifying. (Done to prevent
divide-by-zero errors. In practice, shouldn't be much of a problem.)
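The aspect ratio arithmetic, with the two-digit guard, comes down to something like the following Python sketch. The function name and the choice to return None while the guard is active are assumptions for illustration only.

```python
def linked_height(actual_width, actual_height, width_text):
    """Return the height that keeps the image's aspect ratio for the width
    typed so far, or None while the box holds fewer than two digits."""
    if not (width_text.isdigit() and len(width_text) >= 2):
        return None
    return round(int(width_text) * actual_height / actual_width)
```

For an 800 x 600 image, typing a width of 400 gives a height of 300.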
Fixed yet another minor bug in the rewrapping code. If you overrode the
block wrap margins, the first paragraph inside the block markup would
use the standard rewrap margins instead of the overrides.
Added a "Multi line" option to both the Auto Table and Auto List HTML
markup generators. In the auto table function, if multi line "ML" is
checked, it will treat each paragraph of text as a single row, grouping
everything that aligns vertically into a single cell. (In other words,
you need to leave a blank line between "rows" of the table.) It can
use either multiple spaces or vertical bars as the cell delimiter.
Vertical bar delimiter is better if there are lines in the columns that
are empty.
For example if you select this ASCII table and do an Auto Table:
Carolina Poplar |100 ft.|Grows in a dry soil. Fastest growing street
                |       | tree. Its dropping fruit is a nuisance.
                |       | Sheds leaves early.

Catalpa         |50ft.  |Lovely white blossoms in June. Seed pods
                |       | stay on into winter. Quick growing.
                |       | Good lawn tree.

English Hawthorn|30ft.  |Flowers in June. Red berries. Grows on
                |       | dry soils. Slow grower. Sharp thorns.

Linden          |90ft.  |Easy to grow. Fragrant flowers. Rapid
                |       | grower. European species smaller than
                |       | American.

Live Oak        |100 ft.|Not hardy in the North. Grows south of
                |       | Virginia. Beautiful evergreen oak. Likes
                |       | moist soil.
It would yield this, if "ML" is checked:
<table align='center' border='1' cellpadding='2' cellspacing='0' summary=''>
<tr><td align='left'>Carolina Poplar</td><td align='left'>100 ft.</td><td align='left'>Grows in a dry soil. Fastest growing street tree. Its dropping fruit is a nuisance. Sheds leaves early.</td></tr>
<tr><td align='left'>Catalpa</td><td align='left'>50ft.</td><td align='left'>Lovely white blossoms in June. Seed pods stay on into winter. Quick growing. Good lawn tree.</td></tr>
<tr><td align='left'>English Hawthorn</td><td align='left'>30ft.</td><td align='left'>Flowers in June. Red berries. Grows on dry soils. Slow grower. Sharp thorns.</td></tr>
<tr><td align='left'>Linden</td><td align='left'>90ft.</td><td align='left'>Easy to grow. Fragrant flowers. Rapid grower. European species smaller than American.</td></tr>
<tr><td align='left'>Live Oak</td><td align='left'>100 ft.</td><td align='left'>Not hardy in the North. Grows south of Virginia. Beautiful evergreen oak. Likes moist soil.</td></tr>
</table>
Which would display as:
| Carolina Poplar  | 100 ft. | Grows in a dry soil. Fastest growing street tree. Its dropping fruit is a nuisance. Sheds leaves early. |
| Catalpa          | 50ft.   | Lovely white blossoms in June. Seed pods stay on into winter. Quick growing. Good lawn tree. |
| English Hawthorn | 30ft.   | Flowers in June. Red berries. Grows on dry soils. Slow grower. Sharp thorns. |
| Linden           | 90ft.   | Easy to grow. Fragrant flowers. Rapid grower. European species smaller than American. |
| Live Oak         | 100 ft. | Not hardy in the North. Grows south of Virginia. Beautiful evergreen oak. Likes moist soil. |
For best results you'll need to remove any leading or trailing vertical
bars before you run the function. Note: The vertical bars DO NOT
need to line up in the ASCII table for the Auto Table converter to work
correctly, they just need to be somewhere between the cells.
Multiple spaces for separators work as well, but can not compensate for
blank cells. You can only use multiple spaces as the separator if every
line in every cell has something on it.
Auto List ML is similar in that it treats each paragraph as a single
list item. Essentially, it uses blank lines to denote item breaks
rather than line breaks.
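A rough sketch of the multi line grouping described above, written in Python for illustration rather than taken from the guiguts source. The function name ml_table_rows and the sample text are hypothetical; it assumes vertical bars as the delimiter and blank lines between rows.

```python
import re

def ml_table_rows(text, delimiter="|"):
    """Treat each blank-line-separated paragraph as one table row."""
    rows = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        cells = []
        for line in paragraph.splitlines():
            parts = [part.strip() for part in line.split(delimiter)]
            # Grow the cell list if this line has more columns than seen so far.
            while len(cells) < len(parts):
                cells.append([])
            # Empty pieces are skipped, so blank cells on a line are harmless.
            for index, part in enumerate(parts):
                if part:
                    cells[index].append(part)
        rows.append([" ".join(pieces) for pieces in cells])
    return rows

sample = """Carolina
Poplar   |100 ft.|Grows in a dry soil.
         |       | Sheds leaves early.

Catalpa  |50ft.  |Lovely white blossoms in June.
         |       | Good lawn tree."""

rows = ml_table_rows(sample)
```

Each inner list in rows would then become one row of table cells. Note that the bars never need to line up; only their count per line matters.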
Added a basic interface to HTML Tidy. HTML Tidy is a very
comprehensive HTML checker/corrector. It would be almost impossible
(and rather pointless) for me to duplicate the functionality in
guiguts. However, it has a command line interface, which can be difficult
for people unused to one. This eases the interface at the cost of some
restricted customization. If you have an HTML file you want to
check, open the file, open the HTML window and click on the HTML Tidy button near
the bottom. It will run tidy on the open file and generate an error and
warning report similar to the gutcheck report. You can double left
click on an error to go to that error, and right click on an error to
remove it from the list. Note: tidy works on the open file, you don't
need to save it before running tidy. There is also an option to
have tidy automatically fix the file. If you select this option, it
will apply all of the changes and save the result to a file with a
"tidy." prefix. The open file WILL NOT BE CHANGED. I.e., if your
file is named "myfile.html" and you run the tidy modify function,
myfile.html will not be changed; the changes will be written to the
file tidy.myfile.html. The tidy.myfile.html file WILL NOT be compatible
with the page/proofer notation from the .bin file, because the tidy process
will throw off the indexes.
When you run tidy on the file, it will echo the file to the
console as it is parsing it, which slows down the process quite a
bit. For large files and slow computers it may take several
tens of seconds to work its way through the file. Note: This is
somewhat dependent on which version of tidy you are using too. I
haven't yet figured out how to suppress the echo during processing. The
-quiet option doesn't seem to do anything.
I did not include a copy of HTML Tidy in the
download package. Visit the Tidy project page to
download the executable (or source, if that's your thing,) appropriate
to your platform.
Version
.436 (513k) Arrgh. Bug fix in .433 that fixed validation
fault bug was buggy. Inserted hex ordinal of character instead of
decimal ordinal. (Basically inserted the wrong character: &#27; when it
should have been &#39;.) Now fixed. Only a problem if you have image
links with alt or title text that has apostrophes or single quotes in
it.
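To illustrate the decimal versus hex mix-up, here is a small Python sketch (not the guiguts code; the function names are made up for the example). The apostrophe is character 39 in decimal, which is 27 in hex, so the hex digits are only correct when written with an "x" prefix.

```python
def decimal_entity(char):
    """Decimal numeric character reference, e.g. &#39; for an apostrophe."""
    return "&#{};".format(ord(char))

def hex_entity(char):
    """Hex numeric character reference; note the required 'x' prefix."""
    return "&#x{:x};".format(ord(char))

def escape_apostrophes(text):
    """Replace every apostrophe with its decimal reference."""
    return text.replace("'", decimal_entity("'"))

escaped = escape_apostrophes("The dog's bone")
```

Writing the hex digits without the prefix, as in &#27;, refers to character 27 decimal, which is a control character and not an apostrophe.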
Version
.435 (513k) Added spell check dictionary select option under
Prefs menu so you don't need to run a bogus spell check before you can
change dictionaries.
Added a few hot keys to the word frequency window to allow saving of
the word lists; Ctrl+s and Ctrl+x. Ctrl+s will Save the contents of the
word frequency window exactly as it is displayed, (with the overview
and word counts,) to a file, by default named wordfreq.txt in the same
directory as the original text file. (May be changed if desired.)
Ctrl+x will eXport the contents of the word frequency window to a
file that only contains the actual words in the list, in the order
displayed, new-line separated; no frequency counts, no overview line.
Exports to a file, by default named wordlist.txt in the same directory
as the original file. (May be changed if desired.) At this time, the export
function will strip the asterisks off of "suspect" words in the lists,
(if there are any in the current list.) I debated just removing
"suspect" words completely from the exported list, but figured this is
probably more useful. If someone has a differing opinion, I would
be interested to hear it. The focus must be on the Word
Frequency window to use these hot keys. If the focus is on the main
window, the actions associated with the hot keys for that window will
be performed instead. (Save and Cut) Probably not a great idea to overload
the hot keys, but there are just too many functions and not enough
keys.
I put the hot key codes in the title of the Word Frequency window so it
would be easier to remember them.
Version
.434 (512k) Fixed bug
in block wrapping code where if there were multiple paragraphs within
one block, the first paragraph would be indented one more space than
the following paragraphs.
Version
.433 (511k)
Fixed bug in image viewer initialization code where it would look in
the default directory for image files if it existed, even if a
different directory was selected in the file preferences.
Fixed bug in generated HTML image links where apostrophes were not being escaped
properly in alt and title properties. W3C validator would complain
about them.
Version
.432 (510k)
Fixed very aggravating bug introduced in .43 where script would
occasionally lock up while selecting text or rewrapping.
Modified rewrap routine to automatically convert non-breaking spaces to
regular spaces while rewrapping. I had set up the wrapping routine to
honor non-breaking spaces under the assumption that if you used them,
you wanted to maintain a certain layout. However, someone has been
converting all spaces to non-breaking spaces during proofing, and it
really interferes with wrapping unless you happen to notice and change
them manually.
Added hot key 'v' (view) to the page separator function. Will open the
image viewer to the current page. Handy for doing quick checks of
ambiguous paragraph breaks at page breaks.
Trapped warning that would come up on the console window during
the page separator function if you pressed one of the buttons (or
hot keys) while in full auto and not waiting for input.
Some functions write temp files to the disk while they are running.
If there was a problem where the file couldn't be written or
read, the script would just silently and mysteriously fail. (Usually
space or permissions problem.) Am now explicitly checking that
read and write operations are successful and popping up a warning
message if they aren't.
Version
.431 (510k)
Fixed the normal bug I introduce fixing some other bug. When
rewrapping a text, the rewrap margin for standard text would be
shifted right by one space after a block rewrap block. Now returns to
the correct indent (normally zero for standard text, but it could be set to
something else.)
Fixed behavior under Linux where window was expanding and
contracting when the line numbers were toggled on and off.
Version
.43 (510k) Worked on Italic/Bold Word Frequency routine to have
more accurate counts on the non-marked up phrases. Still not perfect,
in a few unusual circumstances it can still be off, but it is much more
accurate in general.
Added an optional line number bar to the main text window. Right click
on the line number / column in the status bar at the bottom to toggle
it on and off. Note: having line numbering on will significantly slow
down functions that scroll the display; (Rewrap, Fix-up, etc.) Tracking and updating the line numbers adds a fair
amount of overhead. It is probably a good
idea, (though not strictly necessary,) to toggle the line numbers
off while performing one of those functions.
Tweaked page separator fix-up code to recognize and work with markup
that is split over a page break. If a page ends in a closing markup and
the next starts with an opening markup, it will delete the extraneous
markups.
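A minimal Python sketch of the split markup clean-up, assuming the page separator line itself has already been removed and that only <i> and <b> markup is involved. The pattern and function name are illustrative and are not the ones guiguts uses.

```python
import re

# Hypothetical pattern: a closing tag, some whitespace, then the same tag reopened.
SPLIT_MARKUP = re.compile(r"</(i|b)>(\s+)<\1>")

def join_split_markup(text):
    """Drop a close/open pair of the same markup, keeping the whitespace between."""
    return SPLIT_MARKUP.sub(r"\2", text)

joined = join_split_markup("<i>the end of one page</i>\n<i>continues here</i>")
mixed = join_split_markup("<i>italic</i> <b>bold</b>")
```

The backreference \1 makes sure only matching pairs are removed, so italic markup followed by bold markup is left alone.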
Fixed stupid bug in rewrap code. When I changed the parser to recognize
block markup without a leading blank line, I accidentally introduced a
bug. It was adding a spurious blank line before all block
markups that DID have a leading blank line already. Doh!
Fixed now.
Cut out some debugging code I accidentally left in the previous release.
Version
.424 (504k) Sigh. Fixed bone-headed error in Guess Page Markers
function where it was not left padding the page names from 26-99 with
zeros.
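For illustration only, left padding page names might look like this in Python. The three digit width is an assumption based on the 26-99 range mentioned above, and page_name is a made-up helper.

```python
def page_name(number, width=3):
    """Left pad a page number with zeros to a fixed width (width is assumed)."""
    return str(number).zfill(width)

names = [page_name(n) for n in (7, 26, 99, 100)]
```

With a fixed width, the names sort the same way as text and as numbers, which is why missing padding in one range is easy to notice.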
Version
.423 (504k) Added Word Frequency Marked up phrase search. Sort
out words / phrases with italic or bold markup and display them and
similar words / phrases without markup. WARNING: Counts for the unmarked phrases may be
inaccurate, especially if they cross line boundaries. Right click on
the Ital/Bold Words button to change the phrase word limit. Marked up
phrases with word counts above that threshold will not be included.
Default threshold is 4. If a phrase crosses a line boundary, the
threshold may be off by one.
Fixed regex search results that cross line boundaries to be highlighted
correctly under Linux and OSX.
Version
.422 (504k) Fixed bug in Word Frequency spell check where it was
always using the English dictionary no matter which dictionary was selected.
Added the currently selected dictionary to the spell check window title
bar to make it easier to keep track of which is selected.
Version
.421 (503k) Modified spell checking code to not be confused by
unrecognized abbreviations that contain a number. (1er, though a valid
abbreviation in French, was causing problems in Aspell.)
Changed spell checker to clear highlighting and word list when it is
re-run while open and when changing dictionaries.
Bound F7 key to spell checker. Pressing F7 while the focus is in the
main window will run spell check on the selected text or the entire
document if no selection is made.
Added Word suggestion mode to Aspell
options. Customize how aggressively Aspell looks for possible
replacement words.
Modified suggestion list header to report
how many suggestions were returned. Allows easy comparison of the
various suggestion modes.
Version
.42 (502k) Fixed small problem where if there was a hyphen at
the end of the last line in a block rewrap block, it was moving the
last word down on to the same line as the block end marker. A fairly
rare problem, but annoying when it happens.
Worked on getting guiguts to play nicely with the upcoming .99
version of gutcheck. Lots of new functionality in gutcheck, which required some
fairly substantial tweaking of the interface code. Expanded view
selection list to include ALL of the
error queries that gutcheck can possibly emit, including some
that were in earlier versions but were very low frequency.
Rearranged view selection list to be in alphabetical order rather than
the haphazard order it was in before. Makes it easier to find a
specific selection in the list. (NOTE: Gutcheck .99 has not yet
been released. I have been working with a somewhat buggy beta version.
Guiguts will still work with (and ships with) gutcheck .981, but when
.99 becomes available, it should drop in without modification.)
Played around with the proofer messaging code a bit. Now when you press
"send message" in the proofer pop-up window, if no proofer name is
highlighted, it will open a generic send message window. If a proofer
name IS highlighted, it will open that proofer's profile page. (From
which you can send a PM, but which has lots of other useful information
too.) Modified it a bit to work with proofer names that contain spaces
or non-alphabetic characters. They aren't common, but they DO exist.
You can just select an entire line and the proofer name will be
extracted. (Triple left click on the proofer name.) Selecting the
entire line won't work in sort-by-page view as both round proofer names
are on each line. You'll need to highlight just the proofer name to
whom you wish to send a message.
Added another optional status bar to the bottom of the screen. If you
are working on a DP file that has the proofer names in it, and right
click on the See Proofers button in the status bar, another bar will
open that displays the names of the proofers of the current page,
updated as you move through the file. (NOTE: if you move page
markers around, this will be thrown off. It works by matching up the
current page marker with the proofer names stored in a hash indexed by
page number. Once the page markers are moved or reindexed, the markers
will no longer correspond to the correct hash entry. The pop-up
proofers window will always be accurate.)
Reworked sort order for proofer names to be case insensitive. It was
sorting in ASCII sort order (Capitals before lower case). Now
does true alphabetical sorting.
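The difference between the two sort orders can be shown with a short Python example. The names are invented, and this is only an illustration of the idea, not the Perl code guiguts uses.

```python
names = ["thundergnat", "Alice", "bob", "Zed"]

# Plain sort compares character codes, so all capitals come first.
ascii_order = sorted(names)

# Case insensitive sort: compare a case-folded copy of each name.
alphabetical = sorted(names, key=str.casefold)
```

Here ascii_order puts "Zed" ahead of "bob", while alphabetical lists the names the way a person would expect.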
Fixed a bunch of problems with the link checking code. It wasn't
finding links that were broken over multiple lines. (Uncommon for
guiguts auto generated code, but common in many HTML editors and tidied
files.) Was erroneously reporting links with forward slashes as errors.
(Was supposed to be back slashes.)
Fixed small problem in rewrapping module where it was miscounting line
lengths that contained bold HTML markup. Not a big deal, but it could
affect rewrapping slightly if you had lots of bold markup.
Modified block rewrapping functions to work even if there ISN'T a blank
line on either side of the /# #/ markup. Will only use default
indents in this case though.
Added some features to the search function. If you have a word selected
in the text, and click Search (or Ctrl+f), the search box will load the
selected word and search for it. Since the search function searches
first on selected text - a previously requested feature - it will find
the selected word first, then search the file past where the word was selected. You
can manually set the cursor where you want it to start searching if
desired. If you have text that extends over one line selected,
only the first line of the selection will be used as the search term.
The rest will be truncated.
Added a "Start at Beginning" selector in the Search window to
restart the search from the beginning (or end, if searching reverse) of
the file.
Added Ctrl+f as a hot key combo to the search window. Does the same
thing as the search button or Enter key. Now Ctrl+f in the main window
starts the search function and Ctrl+f in the search window searches for
the next instance. Just because I felt like it.
Tried to make HTML autogenerate more resistant to errors caused by
customized header files.
Modified block rewrapping slightly. It would ignore a first line indent
override if the indent was set to zero. (Fairly obscure) Fixed. Also
modified so that if several paragraphs were included in a block rewrap,
and a first line override was given, each paragraph in the block would
get the override instead of just the first.
Worked on page marker routines to be a little more tolerant of
four digit page numbers and work smoothly at boundaries (before first
page marker and after last.)
Added an Insert button to the page marker adjust functions. Will
automatically make room for a page marker if necessary and add it at
the cursor. (The Add button will only insert a marker if there is
already room for it.)
Finally got around to installing Linux on
one of my systems and worked
on trying to iron out a few of the peculiarities guiguts exhibits on
Linux.
Got external programs to run in non-blocking mode under Linux.
Necessitated extensive rewrite of the external program calling code,
and the external program spawning script. Changed the name of the
external script from runner.pl to spawn.pl. For some reason, the name
runner.pl seemed to confuse a lot of people and I got quite a few
questions about it.
Attempted to trap right click menu error in text window under OSX and
Linux. Still am not able to get a right click to pop up a menu like under
Windows, but at least it doesn't just crash the program any more when you
right click.
Modified the ", Upper" and ". Lower" frequency searches to work correctly
(or, at least, reasonably correctly) under Linux and presumably OSX.
Modified Unicode routine to give a little more indication of what is
going on under Linux. The Linux X font server takes a long time to fault
out when a particular character is not implemented in the selected
font. If there are many unimplemented characters, it can seem like the
program has hung. At least now you have an indication every 3-4 seconds
that it is still working. Not a problem under Windows. Apparently the
Windows font manager routines are much faster in dealing with
unimplemented characters.
Added another preference to the Prefs menu. Set Browser Start Command.
For Windows, it is probably best to leave it as 'start'. That will
start whatever your default browser is. Otherwise, enter the full path
to the executable. This will allow non-Windows users to customize the
browser start command for their OS.
Version
.411 (499k) Sigh. I had made some changes to the link generation
code and changed the link searching code to find the newer links, but
broke searching for older style links in the process. Fixed.
Version
.41 (499k) Fixed minor problem in word frequency, ". Lower"
search where searching on the results would occasionally return
unexpected results under some circumstances.
Added a "Surround Selection With..." function. Replaced "Insert
_ _ Around Selection" function. It does the exact same thing
except the text that is inserted around the selection is
editable. (It is still _ _ by default.) I was
contemplating adding a bunch of single purpose functions to add
various markups, but realized this was much more flexible (and
less aggravation in the long run.)
Fixed File Open dialog parameters for Linux thanks to a bug report and
patch submitted by Gregory Margo. Probably will fix nagging problem
under OSX too.
Worked on HTML internal link function a bit, to be better about
guessing the anchor you are trying to link to. Much better about
finding exact matches, much better about finding not exact, but close
matches. When you select some text to be an internal link, it will try
to find a named anchor that has that exact wording, case insensitive,
and punctuation removed. Next, it will find all of the named anchors in
the file that have some of the words in the selected text, (excluding
"the", "a", "and" & "to") and list them. Finally, it will list ALL of the
named anchors in the file. If you have fewer than a hundred or so
named anchors, this is probably overkill, but I just completed a
project with nearly 1800 named anchors and 3900 internal links and this
was a real sanity saver.
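The three stage anchor guess described above could be sketched roughly as follows. This is Python written for illustration under some assumptions: anchor names use underscores between words, and the final stage lists only the anchors not already shown (the text above says ALL anchors are listed last). The names words and rank_anchors are hypothetical.

```python
import re

STOP_WORDS = {"the", "a", "and", "to"}

def words(text):
    """Lower-case word list with punctuation removed; underscores act as spaces."""
    return re.sub(r"[^\w\s]", "", text.replace("_", " ")).lower().split()

def rank_anchors(selection, anchors):
    """Order anchors: exact matches, then partial matches, then the rest."""
    wanted = words(selection)
    keywords = set(wanted) - STOP_WORDS
    exact = [a for a in anchors if words(a) == wanted]
    partial = [a for a in anchors
               if a not in exact and keywords & set(words(a))]
    rest = [a for a in anchors if a not in exact and a not in partial]
    return exact + partial + rest

anchors = ["Chapter_I", "The_Red_Fox", "Fox_Hunting", "Index"]
ranked = rank_anchors("The Red Fox.", anchors)
```

Excluding the common words keeps nearly every anchor containing "the" from showing up as a close match.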
Added a rudimentary internal link checker to the HTML fixup page. This
will find all of the named anchors, internal & external links and
image links in your file and will list: totals of each; links
without anchors (this is a critical error, you've got an internal link
that goes nowhere); anchors without links (informational, you will
probably have several, especially if you have included page anchors);
all image links (check the image links to make sure they are all
lower case to comply with Gutenberg standards, and that they are all
relative, not absolute links. Will warn if there ARE any image links
with upper case characters); and external links (this is probably an
error unless you have a multi-volume, cross-linked text.
Normally, external links are frowned upon in Gutenberg texts). It will
also list any links that have any spaces or backslashes in them (or
their numeric equivalent) ' ', '%20', '\', '%5C', which are almost
always a mistake.
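As an illustration of the last check, here is a short Python sketch that flags links containing spaces or backslashes or their numeric equivalents. The patterns and the function name are assumptions made for the example; the real checker does more than this.

```python
import re

# Capture the quoted value of href and src attributes (may span lines).
LINK = re.compile(r'(?:href|src)\s*=\s*"([^"]*)"', re.IGNORECASE)
# A space, %20, a backslash, or %5C anywhere in the link.
SUSPECT = re.compile(r" |%20|\\|%5C", re.IGNORECASE)

def suspicious_links(html):
    """Return link targets that contain spaces or backslashes."""
    return [link for link in LINK.findall(html) if SUSPECT.search(link)]

html = ('<a href="#Page_12">12</a> <img src="images\\plate1.png" alt="">'
        '<a href="my%20notes.html">notes</a>')
flagged = suspicious_links(html)
```

Because the attribute value pattern is not tied to a single line, links that an editor or tidy has broken over several lines are still found.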
Fixed (or at least worked around) annoying bug where the HTML image
function would not let you edit the title or alt text until you shifted
focus away from the window and back. (By the simple expedient of
automatically shifting focus away when it is opened.)
A bunch more useful patches from Gregory Margo:
Modified various word frequency routines to not include page separators
in the word counts. (I was of two opinions about this but went
ahead and added it.)
Modified spell checking to skip page separators. (This was definitely
useful. I can't think of any reason to spell check the page separators.)
Modified spell checking process handling to be more Linux friendly.
Dictionary handling is much cleaner. (Also helps under windows and
presumably OSX as well. No more zombie processes hanging around after
you run spell check. Yay! :-) )
Big thanks to Gregory!
Made spell check re run automatically if dictionary is changed. No
longer necessary to close and re open spell check.
Fixed a rather egregious error in the spell check function. The Change
All button would change ALL of the words queried by the spell checker
to the first suggested replacement. Almost definitely not what you
would expect (and not what I intended.) Now will change all occurrences
of the PRESENT word to the selected replacement.
Version
.403 (493k) Arrgh. Neglected to escape forward slash in line
6389.
Version
.402 (570k)
Fixed problem in spellcheck where it wasn't replacing the misspelled
word completely.
Changed the poetry HTML autogenerate to use markup that would be
friendlier to non CSS aware browsers. Also changed <br> and
<hr> to use XMLish markup; i.e. <br /> & <hr />.
Change Unicode information file to a save method that will be
compatible with big-endian byte order OSs. (OSX)
Version
.401 (493k)
Changes I made to fix regex \n search in files with multi byte Unicode
characters broke \n searching in files without them. Inserted a switch
statement in the code to use appropriate character counting scheme
depending on whether the file contains multi byte characters or not.
Fixed minor problem in search popup where selecting Regex wouldn't
automatically unselect Whole Word. The Regex and Whole Word are
mutually exclusive. Setting both will never return any results.
Version .40 (493k) Version .40
needs to use the perl runtime libraries version 3 prl03.zip
(5772kb). If you already have prl02, you can update to the prl03
package by just downloading the prl03update.zip
file (52kb) and unzipping it in the directory where your current prl
directory is located. (Less than one hundredth the size.)
Finally worked out a way to get more intuitive sort order for Latin-1 accented
characters in the word frequency routines without breaking Unicode
compatibility. Not as elegant as I would have liked. Basically, I am brute
forcing the sort routine, which nearly doubles the time the sort takes.
Still, the trade off is acceptable IMO. Except on very large files or
very slow computers, the sort still only takes a few seconds.
Added two more buttons to the word frequency routine to check for
instances of a comma followed by an uppercase character or a
period (full stop) followed by a lower case character. These will
find all instances of these whether they cross line boundaries or
not. If they DO cross a line boundary, the newline will be
represented by "\n". You will be able to search for these using right
and left clicks in the word frequency window. Terms that have a newline
(\n) in them may take a few seconds to find the first one when using
right click. You may still need to manually check for paragraphs
ending with a comma. Here is a nice regex to do so:
',("?\n{2,})' => '.$1'
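For anyone who wants to try the same search and replace outside guiguts, the regex translates directly to Python's re module. The sample text and function name are invented for the example.

```python
import re

def fix_paragraph_commas(text):
    """Turn a comma that ends a paragraph into a period.

    The group keeps the optional closing quote and the blank line(s).
    """
    return re.sub(r',("?\n{2,})', r'.\1', text)

sample = 'He stopped,\n\n"Go on,"\n\nshe said, quietly.'
fixed = fix_paragraph_commas(sample)
```

Commas in the middle of a line are untouched, since the pattern requires at least two newlines after the comma (and optional quote).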
Added an Initial Caps sort function to the
word frequency window by special request. Sorts out all of the words in
the file that have the first letter capitalized, and at least one other
non upper case character.
Twiddled around with the sort and parse logic for many of the word
frequency functions to try to speed them up a bit.
Removed the Re Sort button from Word
Frequency window. It was
contributing to the large memory footprint significantly, due to the
state information I needed to keep in memory to know WHAT to re sort.
It has been replaced with All Words, which is more useful, in my
opinion. It allows you to get back to the full list without having to
do a full search and count sequence.
Added Unicode > FF button to the word frequency window. Sorts out
all words that contain characters over hex FF. (Outside of
Latin-1).
The sort order is by ordinal for characters over FF so the display
order may seem unintuitive.
Rewrote the Search Stealth Scannos routine to
no longer use recursion.
It wasn't really hurting anything, but it would pop up warnings about
it on the console if it recursed more than 256 levels. (Which was
pretty easy to do.)
Tweaked the search code to scroll the end of the found term completely
onto the screen on search. It was set up to scroll the beginning of the
found term onto the screen, which was fine for single line searches,
but multi line searches would often end up with the found term being
half on, half off the screen. This will ensure that the found term
(unless it is exceedingly long) will be completely visible after a
search.
Added the number of words left to check to the spell check pop up
window title bar. That is, the total number of words in the unrecognized list,
not just the unique words. If you Skip All or Add To Dictionary, the
count will be reduced by the number of times the word appears in the
list of misspelled words.
Added a tool bar. Put some of the often used routines on it. I'm not
really sure if this is necessary or even particularly desirable, but it
was interesting to play with. I am open to suggestions as to which
functions should be accessible through the toolbar. If you don't like
or want the tool bar, disable it under the Prefs menu. You can drag the
toolbar to the side of the window you want it to dock on, or drag it
onto the desktop to use it as a floating widget. Select the side you
want the toolbar to start on under the prefs menu.
Fixed error in stealth scanno editors' save code where it was
incorrectly escaping backslashes for the hint index terms.
Fixed file save functions to save text with Unicode characters as
Unicode, and text without as Latin-1. Was saving text with Unicode as a
bizarre blend of Latin-1 and Unicode. Had to rework the gutcheck
functions to feed it the bizarre blend, as it doesn't like Unicode AT
ALL. As a result, gutcheck will no longer save the file when you run
it. (Actually, it does, but it saves it to a temp file, runs gutcheck
on THAT, then deletes it again.)
Had to reconfigure the spell checking code to deal with the new file
save code a bit. Aspell 0.50.03 doesn't handle Unicode characters
well at all. I did some work arounds for the word frequency spellcheck
routine, but the main spell check will choke on words with non Latin-1
characters. Apparently Aspell version 0.60 is due to be released soon,
and that has Unicode support built in. It is available in beta, but you
must build both it and the dictionaries yourself, so it is not really
recommended yet. As soon as it is officially released, I will probably
change over to it.
Modified HTML character code to use <center></center>
markup for centered images. Apparently the align="center" attribute is
not very well supported.
Modified the file save logic to let you save a file, even if you have
made "no edits". There are quite a few functions now that bypass the
undo buffer and so don't raise the "edited" flag. As a result, there
are many occasions when you might want to save the file, even if
guiguts doesn't know it has been edited. This change allows you to
without having to make a bogus edit to set the flag.
Fixed the regex engine to search for terms with newlines to work
correctly with files that contain multi byte Unicode characters.
It was blindly counting every byte as a character, so would get
unsynchronized when there were Unicode characters in the text.
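The byte versus character mismatch is easy to reproduce. This Python snippet is only an illustration of the problem, not the guiguts fix: it compares the position of a newline counted in characters with its position counted in UTF-8 bytes.

```python
text = "caf\u00e9\nna\u00efve"   # "café", newline, "naïve"
encoded = text.encode("utf-8")

char_count = len(text)          # characters
byte_count = len(encoded)       # bytes; accented letters take two each

newline_char_offset = text.index("\n")
newline_byte_offset = encoded.index(b"\n")
```

The two offsets differ by one here because of the single accented letter before the newline, and the gap grows with every multi byte character that precedes the match.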
Went through code trying to tighten up variable scoping and reduce
memory footprint. Guiguts, by its nature, is a memory hog. Don't know
if I accomplished much, but *I* feel better.
Added an ordinal readout to the bottom status bar. Displays the ordinal
of the character just to the right of the cursor in both decimal and
hexadecimal. If you click on it, it will toggle showing the name of the
character as well.
Fixed problem where thought breaks could get munged during rewrap under
certain fairly rare circumstances.
Tweaked spell check and word frequency routine to not NEED to have
files saved before they can be run. For best results, you SHOULD save a
file that has been edited before you run them, but now, if you just
want to run a quick spell check or word frequency analysis on a new,
unsaved file, you can.
Version .39
(485k) Added some
functions that are not supported by my first release of the perl
runtime libraries. Since I had to release a new prl, I took the
opportunity to update Tk to a newer version with several important bug
fixes. You will need to update to prl02.zip to be able to run version
.39 (Or install Tk804.026, Image::Size and Tk::Toolbar in your local
perl package.) Guiguts version .39 WILL NOT WORK with prl01.
Heavily modified the image handling code for HTML generation. It is
still not automatic, but much more of the work is done for you. Added a
button to the HTML fixup window; "Auto Illus Search" that will scan
through the file and semi-automate HTML image code insertion. Any
text inside [Illustration: ...] markup will be placed in an alt=" "
tag, and you will be able to see/adjust the size, alt, title, and
alignment properties. Once you select an image file, a thumbnail image
will be placed at the bottom of the Image Selection box. Click on the
thumbnail to change the file. *Note: file names will be displayed as
absolute in the Image Selection window. As long as the image file is
located in a sub folder or the
same folder as the HTML file, it will be converted to a relative path
name when it is inserted. The thumbnail size will vary somewhat with
different image sizes. Tk does not have any easy mechanism for scaling
images smoothly to any ratios besides 1/n, where n is
a positive integer, which makes it difficult to scale them to an exact
size. For simple thumbnail previews, where the actual size is not
critical as long as you can see the image, it works OK though.
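A small Python sketch of why the thumbnail size varies, assuming images can only be reduced by a whole number factor (Tk photo images are copied with integer zoom or subsample factors). The 100 pixel target and both function names are made up, and the resulting sizes are approximate.

```python
import math

def subsample_factor(width, height, target=100):
    """Smallest whole number n so the image reduced by n fits the target box."""
    return max(1, math.ceil(max(width, height) / target))

def thumbnail_size(width, height, target=100):
    """Approximate thumbnail dimensions after reducing by a whole number."""
    n = subsample_factor(width, height, target)
    return width // n, height // n

exact_fit = thumbnail_size(800, 600)
loose_fit = thumbnail_size(450, 300)
```

An 800 pixel wide image divides evenly and lands on the target, while a 450 pixel wide one has to be reduced by 5 and ends up somewhat smaller.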
Tweaked the Unicode -> Beta code & Beta code -> Unicode
functions under the Greek transliteration a bit to give more consistent
results. Added the Greek letters with a tonos accent to the translation
code. Tonos is essentially the modern equivalent of acute accent in
ancient Greek. There is some overlap between characters with tonos and
characters with acute, but there are a few unique combinations too.
Added them for completeness.
Rewrote the character builder nearly from scratch. Expanded the
capabilities quite extensively. Made it able to generate all Greek
characters. Just type in the character you are looking for into the
character builder box and the corresponding Greek character will appear
in the box next to it. Press enter to accept the character. If you
press enter with an empty character builder box, it will place a line
return in the Greek text window. If you press backspace in an empty
character builder box, it will backspace one in the Greek text window.
You can add a space by adding a space in the builder box and pressing
enter. To get a terminating lowercase sigma, type in "s " (s+space) in
the character builder box. (Or you can just put a standard sigma and
then do a back and forth transliteration with the ASCII or Beta code
buttons.) You can get a ô (transliteration for omega) by typing
o^ in the builder box. Likewise, ê (eta) can be obtained by
typing e^. You can also use H & h as aliases for upper and lower
case eta and W & w as aliases for upper and lower case omega in the
character builder.
Rewrote the menuing code for Unicode character windows. Made it much
more compact and maintainable. Added the hexadecimal character range
that each block covers for those times when you know a character's index
number but not which block it is in. Made the sort order of the Unicode
menu selectable by block range name (default) or by block range index.
Tweaked named character conversion during HTML autogenerate to convert
strings of em dashes properly.
Fixed a bunch of other minor buglets.
Version .382(482k) A bunch of minor tweaks.
Tried to fix spell check in word frequency to recognize words with
Latin-1 accented characters again. Broke Unicode compatibility in the
process. Seems they are mutually exclusive right now. Oh well, right
now, Latin-1 spell check is more important to me than Unicode spell
check.
Fixed oddity in page separator routine where it would open a different
page than you would expect when you pressed See Image. Now opens the
page that you are currently working on.
Worked on page marker moving and reindexing routines. Made them a lot
less fragile. It was very easy to break them before. They are much more
error resistant now.
Worked on internal link sorting code to work better with numerical link
names. Really improved the ease of index hyperlinking.
Fixed the hide footnote and hide page number link view options.
Note: now that the page marker reindexing tools are working rather
well, you can hyperlink a fairly large index in a very short time. You
need to make sure your page markers are aligned with the actual pages
in the text. (Time consuming, but worth it.) Double check the numbers
in the index against the original. (Very time consuming, but necessary,
I've found. :-( ) Make sure you select "Insert Anchors at Pg #s" when
you auto generate, then do a regex search and replace
'(?<!\d)(\d{1,3})' => '<a href="#Page_$1">$1</a>' in the index (use
(\d{1,4}) if your text uses 4 digit page numbers). Press enter to search
for the next page number and Ctrl+Enter to replace and search. Once the
preliminary setup is done, you can hyperlink an index in minutes.
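As a rough illustration of what that search and replace does, here is a minimal Python sketch. Guiguts itself is Perl; the function name link_index_line and the sample index line are made up for this example, and Python writes the back-reference as \1 where the guiguts dialog uses $1.

```python
import re

# Same pattern as in the note above: a run of 1-3 digits not preceded by a digit.
PAGE_NUMBER = re.compile(r'(?<!\d)(\d{1,3})')

def link_index_line(line):
    """Wrap each page number in a link to the matching Page_ anchor."""
    return PAGE_NUMBER.sub(r'<a href="#Page_\1">\1</a>', line)

# Hypothetical index entry, for illustration only.
linked = link_index_line("Apples, 12, 107")
```

Running the same substitution a second time would also match the digits inside the links it just created, so it only makes sense as a one-shot pass over an index that has no links yet.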
Added /f f/ to the rewrap marker cleanup routine.
Fixed problem in HTML auto generate where it was not adding all of the
poem styles if there were multiple levels of indent beyond 4. Fixed
syntax error in generated indent styles.
Fixed problem where autogenerated TOC entries would sometimes end up
with markup inside them.
Tried to trap a few more instances where orphaned markup could be
generated. Still not perfect, but I'm chipping away at it.
Version .381(480k) Fixed self inflicted gunshot wound to foot. When I
made changes to word sorting routines to allow for Unicode characters, I
broke the code so it wasn't allowing words with mixed characters and
digits, thus rendering the Mixed AlphaNum word frequency absolutely
worthless. Now repaired. Thanks Aria!
Version .38(480k) Version .38
.bin files are NOT backward compatible with previous versions. I have
slightly modified the formats of a few stored variables. The only
one that would probably be a problem is the page marker hash. Earlier
versions used pg001 as their key format; it has been changed to Pg001,
uppercase first char. .38 will automatically convert earlier versions
forward. If you want/need to go back, you may need to manually reset
the page marker hash to use the earlier format. (I didn't do this
lightly, it really made other things much easier having done this.)
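If you do need to move a page marker table between the two key formats by hand, the change amounts to renaming the keys. A minimal Python sketch follows; the function names and the sample values are hypothetical, and it does not read or write the real .bin file format, which is Perl data.

```python
def upgrade_page_keys(markers):
    """Return a copy of the table with old pgNNN keys renamed to PgNNN."""
    upgraded = {}
    for key, value in markers.items():
        if key.startswith("pg"):
            key = "Pg" + key[2:]
        upgraded[key] = value
    return upgraded

def downgrade_page_keys(markers):
    """Return a copy of the table with PgNNN keys renamed back to pgNNN."""
    downgraded = {}
    for key, value in markers.items():
        if key.startswith("Pg"):
            key = "pg" + key[2:]
        downgraded[key] = value
    return downgraded

# Hypothetical marker table: key -> position in the text.
old_style = {"pg001": "12.0", "pg002": "40.3"}
new_style = upgrade_page_keys(old_style)
```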
Reworked Title Case function to work a little better with quoted words.
Started working on method to view and adjust page number markers after
the page separators have been removed. Right click on Page # in status
bar to toggle visible page numbers. Click on a page marker to adjust
the placement. You can also adjust the offset of the various page
markers and add and remove page markers. The functions TRY to disallow
undesirable results. They will not allow you to add a page marker if
the page markers on either side of it are only 1 apart. You can
INCREASE the offset (nearly) unlimited amounts, but it will only
decrease offsets until any gap is closed, no matter what you enter for
the decrease offset amount. (IE Say you have page markers that range
Pg004, Pg005, Pg009, Pg010. You can decrease the offset of marker Pg009
by -4. If you try to decrease it by -10, it will only do -4.) When
adding page markers, place the cursor where you would like the mark to
go and press Add. It will look at the previous page marker, increment
it by one and place it at the cursor as long as it doesn't already
exist.
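The Add rule described above (take the previous marker, add one, refuse if that number already exists) can be sketched like this in Python. This is only an illustration of the rule as stated, not the guiguts code: the list-of-tuples layout, the character positions and the name add_marker are all assumptions.

```python
import bisect

def add_marker(markers, cursor):
    """Insert a new page marker at `cursor`.

    `markers` is a list of (position, page_number) tuples sorted by position.
    Returns the new marker name, or None if nothing was added.
    """
    positions = [pos for pos, _ in markers]
    i = bisect.bisect_right(positions, cursor)
    if i == 0:
        return None  # no previous marker to increment from
    new_number = markers[i - 1][1] + 1
    if any(number == new_number for _, number in markers):
        return None  # that page number already exists
    markers.insert(i, (cursor, new_number))
    return f"Pg{new_number:03d}"

# Hypothetical layout matching the Pg004, Pg005, Pg009, Pg010 example.
markers = [(0, 4), (100, 5), (200, 9), (300, 10)]
added = add_marker(markers, 150)   # between Pg005 and Pg009
refused = add_marker(markers, 50)  # would be Pg005, which already exists
```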
Right now there is no mechanism to change the png image names to match
the changed page marker names, and I doubt that there will be one in
the near future, at least, not one that is tied directly to the page
marker adjust functions.
Changed named anchor generation code to strip apostrophes from the link
name. Apparently, some code checkers were complaining about it, though
the W3C validation service seemed to have no problem with it. (On
further testing, W3C seems to ignore it sometimes and complain others.
Ah well, removed anyway.)
Added two functions under the Selection menu, Convert To
Named/Numerical Entities, Convert From Named/Numerical Entities.
These will convert selected text in the main window to and from
HTML encoded text.
Note: these functions will NOT add or remove any HTML markup, they only
convert characters to and from named and numeric HTML entities.
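For readers who want to see what such a conversion involves, here is a small Python sketch using the standard library entity tables. It is not the guiguts implementation; the function names are made up, and the choice to leave all ASCII (including < > &) untouched is an assumption made so that existing markup survives.

```python
import html
from html.entities import codepoint2name

def to_entities(text):
    """Replace non-ASCII characters with named entities where one exists,
    otherwise with decimal numeric entities. ASCII is left alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)

def from_entities(text):
    """Turn named and numeric entities back into literal characters.
    Note: html.unescape also decodes &lt; &gt; and &amp;."""
    return html.unescape(text)

encoded = to_entities("<i>caf\u00e9</i> \u1f01")
```

Characters without a named entity, such as the extended Greek range, fall through to the numeric form.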
Worked on Word Frequency Accent check to try to catch more variations
of spellings in the suspicious category. Semi successful.
Tweaked rewrap code a bit to try to compensate for possible
interference between rewrap markers and page markers.
Fixed problem with /F F/ code not terminating during rewrap.
Added some more buttons to the Greek Transliteration window;
ASCII->Greek and Greek->ASCII. These will take the selected text
in the Greek transliteration window text box (or all of it, if none is
selected,) and try to transliterate it using the rules from the site.
(Except using ê for eta and ô for omega to distinguish them
from epsilon and omicron.) (As an aside, this is a variation of the
Perseus system for transliterating Greek.) These functions are not
perfect, and shouldn't be used blindly, but they will do perhaps 95-98%
of the work in transliteration. The U/Y transliterations are especially
suspect, since they are somewhat subject to interpretation; this will
lead to text that is not 100% reversible. If you plug in some English
text and transliterate it back and forth, you most likely WILL
NOT end up exactly with what you started with. These functions are
somewhat inefficient, and probably shouldn't be used on chunks of text
larger than about 10-20 K at a time. For small passages, they work ok
though. The punctuation transliteration is also suspect and may be
changed after some testing and feedback. **Note: the Greek auto
transliteration functions will only be available if you have Tk 804.025
installed in your perl libraries.
Added two more buttons to the Greek transliteration window.
Unicode->Beta & Beta->Unicode. These implement a subset of
beta encoding to allow more detailed markup of accented Greek if you
should desire to. For unaccented characters, the transliteration is the
same as the Perseus method (What we use on the site and guiguts has
used up to now.) Beta encoding provides a method to preserve the
accents. There are basically nine accents that you need to deal with
for Greek; they are detailed below. (You will need a Unicode aware font
to view the examples in the chart.)
Popular name                                      Greek name      Symbol  Example  Encoded
rough breathing mark                              dasia           (       ἁ        a(
soft breathing mark                               psili           )       ἀ        a)
acute                                             oxia            /       ά        a/
grave                                             varia           \       ὰ        a\
iota subscript                                    prosgegrammeni  |       ᾳ        a|
tilde (or inverted breve, depending on the font)  perispomeni     ~       ᾶ        a~
diaeresis (rare)                                  dialytika       +       ϋ        y+
breve (rare)                                      vrachy          =       ᾰ        a=
macron (very rare)                                macron          _       ᾱ        a_
To encode a character in beta code, transliterate the base character
as normal. Then, starting from the highest point, working from left to
right, place the symbols for the various accent marks after the base
character. Stack as many accent symbols as needed to make the
character. IE: ᾭ would be Ô(/|.
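To make the stacking rule concrete, here is a hedged Python sketch that turns one encoded character into Unicode. It takes a different route from guiguts: it appends Unicode combining marks and lets NFC normalization pick the precomposed character, whereas guiguts looks up predefined combinations. The BASE table is a tiny hypothetical subset, and decode_one is a made-up name.

```python
import unicodedata

# Hypothetical subset of the base-letter transliteration table.
BASE = {
    "a": "\u03b1",       # alpha
    "y": "\u03c5",       # upsilon
    "\u00f4": "\u03c9",  # o-circumflex -> omega
    "\u00d4": "\u03a9",  # O-circumflex -> capital omega
}

# Accent symbols from the chart, mapped to Unicode combining marks.
ACCENT = {
    "(": "\u0314",   # rough breathing
    ")": "\u0313",   # soft breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "|": "\u0345",   # iota subscript
    "~": "\u0342",   # perispomeni
    "+": "\u0308",   # diaeresis
    "=": "\u0306",   # breve
    "_": "\u0304",   # macron
}

def decode_one(code):
    """Decode one encoded character: a base letter followed by accent symbols."""
    base, marks = code[0], code[1:]
    combined = BASE[base] + "".join(ACCENT[m] for m in marks)
    return unicodedata.normalize("NFC", combined)

omega = decode_one("\u00d4(/|")  # the example from the text above
```

With this approach a plain acute such as a/ normalizes to the tonos form of the letter, which is the overlap between tonos and acute mentioned in the later release notes.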
There is a utility box at the bottom of the Greek
transliteration window to help assemble accented Greek characters.
Select the base character and accents you want from the list and press
enter to place the character in the transliteration window. **Note: for
purists, this is not EXACTLY beta code, as beta code uses uppercase
letters for all Greek letters,
which, by default, means lowercase. :-? To encode an uppercase letter
you are supposed to precede the base character with an asterisk. I
elected not to do this, as we already have too many overloads for
asterisk. The accent encoding is pretty close to standard beta encoding
though.
The character builder is not yet capable of dealing with combining
characters to make Unicode characters, it only indexes combination
characters which are already defined in the Greek and Coptic and
Extended Greek ranges of Unicode. So it is possible, (quite easy in
fact,) to come up with a diacritical combination that it can't handle.
If you want to look at the characters that it can handle, open the
Greek and Extended Greek windows under the Unicode menu.
The character builder is somewhat of a pain to use, (hence the name,
Character builder, by using it you build character(s). :-) ) I haven't
been able to come up with a better way yet though.
Fixed a few problems with sidenote conversion during HTML auto generate
that could hang the process.
Fixed aggravating bug in rewrap code in pre release that was leaving
control characters in the text.
Version .372(467k) Fixed problem where under certain circumstances, the
Goto Page pop up would be unclosable.
Fixed problem where /F F/ markup was not being recognized.
Fixed problem with unclosed paragraphs before /* */ blocks during HTML
auto generate.
Version .371(467k) Under certain circumstances, if you opened and worked
on more than one file in a guiguts session, without restarting, the bin
files could get cross contaminated, resulting in spurious page markers
and bookmark markers.
Version .37(467k)
Started trying to wrap my head around OOP in perl (I don't need any
snide comments from you python people either. :-\ ) Derived my own
class of text widget and overrode the Load and Insert methods to deal
with Unicode characters. No longer any need to patch the default perl
packages to be able to load files containing Unicode.
Wrote some new insert and delete methods which bypass the undo buffer.
I am using them in the HTML auto generate functions, since they work
2-3 times as fast as the methods which manipulate the undo buffer. They
also cut down on memory usage quite a bit. The caveat? (You knew there
was going to be one...) SAVE YOUR FILE BEFORE YOU RUN HTML AUTO
CONVERT! You can no longer undo back out of it.
Fixed problem with auto generate poetry line numbers inside /p p/
markup.
Rewrote a bunch of the header.txt header file after discussion with
some users. You are still free to modify it locally, I just changed to
more logical defaults.
Added markup and auto generate code for sidenotes. Thought I had done
this before, but I guess it never got into the distribution package.
Added tool tip pop-ups to the Unicode windows. The tool tip displays
the decimal and hexadecimal ordinals of the Unicode character the
pointer is over.
Added feature to the Unicode tool tips. Now also pops up the name of
the character as well as the decimal and hexadecimal index. Will pop up
tool tip with information, even if that particular character is not
implemented in the font you are using. *Note: this feature is somewhat
resource intensive, and will cause Unicode windows to load slower the
first time one is called in a guiguts session. If you want to disable
it, search for the file 'Unicode' in the guiguts directory and
rename it to something else. You can rename it back if you want to
re-enable it again.
Made guiguts automatically convert files to native line endings on file
load. *Note: just converts the file in memory, not the file on the
disk. When you save the file, it will be saved with native line
endings.
Fixed missing anchor closing markup in page anchors.
Added a summary attribute to the auto table generator function in HTML
fixup.
Propagated icon to all sub windows that get opened in guiguts. It was
only being shown in main window.
Added a unique identifier to footnote anchors allowing you to reuse the
same footnote numbers/letters multiple times in one file. Useful if you
set up your footnotes index to start over every chapter.
Cleaned up a whole bunch more poor code. Broke a bunch of stuff in the
process. :-( Went back and fixed everything I broke.
(hopefully...)
Changed the binding for Control-A to Select All, to be more like nearly
all Windows programs. Control-/ will still select all too.
Did some more work on HTML auto generate. Poetry inside footnotes
should now be converted correctly.
Made word frequency routine aware of non-breaking spaces. Listed as
*nbsp*. It knew they were there before, it just didn't know what to do
with them.
Modified word frequency's Mixed Case sort to be Unicode aware.
Added a way to check to see if you have duplicate anchors in your HTML
file. Duplicate anchors can cause problems for browsers and will
make validating checkers complain. In HTML fixup, without
selecting anything in the text, press Internal Links. It will check
through the text for duplicate link names and throw a warning if it
finds any. If you don't get a warning, your file is Ok.
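The idea behind the duplicate check can be sketched in a few lines of Python. This is an illustration only, not the guiguts routine: it assumes anchors are written as <a name="..."> or <a id="..."> with double quotes, and the name duplicate_anchors is made up.

```python
import re
from collections import Counter

# First name= or id= attribute inside each <a ...> tag (double-quoted values only).
ANCHOR = re.compile(r'<a\s[^>]*?\b(?:name|id)\s*=\s*"([^"]+)"', re.IGNORECASE)

def duplicate_anchors(html_text):
    """Return a sorted list of anchor names that occur more than once."""
    counts = Counter(ANCHOR.findall(html_text))
    return sorted(name for name, count in counts.items() if count > 1)

# Hypothetical sample with one repeated anchor.
sample = ('<a name="Page_1"></a> text <a name="Page_2"></a> '
          'more <a name="Page_1"></a>')
dupes = duplicate_anchors(sample)
```

An empty list corresponds to the "no warning" case described above.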
Modified Latin-1 popup window to be selectable to enter either the
literal character or HTML named entities. Added pretty much the rest of
the printable Latin-1 characters to the chart while I was at it.
Non-breaking space and non-breaking hyphen are the only ones not there.
Added the pop up tool tips to the Latin-1 Chart too.
Added /p p/ and /# #/ searches under the Search menu.
Added /p p/ search to the orphans search pop-up.
**Note: Gutcheck will complain about /p p/ poetry markup. "Paragraph
starts with lower case." The easy work around is to use /P P/
(upper case P) instead. The markups are identical as far as guiguts is
concerned.
I have removed, (for the time being, at least) the Korean, Japanese
ideograph characters (CJK Unified Ideographs) from the Unicode popup
choices. There are over 21000 characters in the chart, and trying to
load it, with the method I was using, was sucking up all
available memory and crashing the program. If we need to start using
Korean / Japanese ideographs, I'll need to figure out a different way
to display them.
Added another special markup /f f/ or /F F/ - front material. It should
only be used at the front of a text around the title, author,
publishing data etc. In text rewrap it is treated just like /$ $/;
IE no rewrap, no indent. In HTML autogenerate, it will allow the
standard title and author markup, but will center everything else
within the block. Not strictly necessary, but a time saver.
Modified Replace All function in Search and Replace to work with
selections. If you have a selection of text, Replace All will only
operate within the selection. If you don't have a selection, it will
Replace All on the whole file.
Fixed problem with checkfil.chk temp file not being deleted when spell
check was done.
Played around with the status bar looks a little bit. I like the
changes. Made the line number, page number & mode in the status bar
respond to left mouse clicks. Pops up the Goto Line, Goto Page windows
& toggles mode, respectively. Changed the Images and Proofers
buttons to match.
Made some changes to the word frequency filtering routine to make it
better able to deal with non English texts.
Fixed the Alphabetical sort order in word frequency to ignore case when
sorting. It had gotten changed due to programming modifications I had to
make to accommodate Unicode characters, and was yielding unexpected results.
Added a "Suspects Only" option to the Word Frequency window. Will
filter result to only show suspect word pairs in searches that return
them. You will need to re run the particular search if you change the
state of Suspects Only, Re Sort does not understand the filter.
Added another preference setting, "Auto Set Page Marks On File
Open". This has been the default behavior for several versions (and is
STILL the default setting); you now just have the option of turning it off, if
desired. For very long texts, it will speed file load by a significant
amount. If your file already has page markers set, you don't need to
reset them every time the file is loaded. I would recommend leaving it
on by default, unless you are working with VERY large files. You can
still run the set page markers routine from under the File menu.
Version .363(342k)
Fixed a major foofoo in HTML orphans check. Seemingly minor
change had major ramifications.
Version .362(339k)
Fixed a bug in the external commands routine that would cause some
commands not to work inexplicably.
Modified HTML generating code for poetry markup to avoid some fairly
obscure errors.
Added a Versions command under Help menu, mostly for easy
troubleshooting. Reports on most relevant version numbers.
Added an icon. :-) I included a gg.ico in the distribution package that
you can use for a Windows icon should you so choose. Also available as
gg.gif
Rewrote major portions of search code. Exact same functionality. Much
more compact and readable.
Fixed some other really bad code. Hopefully, close scrutiny of the
source will now only induce nausea instead of making you violently ill.
Version .361(338k) Fixed bug in fixup routine where it was removing a
character after a thought break.
Fixed obscure problem where rewrapping would sometimes yield odd results
at page boundaries if page separators were removed manually (select and
delete) rather than using the page separator tool or if there were lots
of blank pages in a text.
Fixed problem where file names with multiple full stops could result in
truncated (wrong) file names for the project info and project
dictionary files.
Fixed bug in .bin file save code that was causing problems on unix
based OSs. Was making faulty assumptions about path separator character.
Fixed a bug with regex searches with a newline character in them that
would only return one match on any particular line. On the face of it,
that doesn't seem like it would be much of a problem. You would think
that, if a search term has a newline in it, it should not occur more
than once on any single line. However, if you searched for a term that
MIGHT have a newline, it would ALSO only return the first occurrence on
any one line. A relatively obscure, but quite annoying bug.
Modified \n assertion special regex search code to allow forward and
reverse searching like the regular searches. Regex searches with
newline assertions will now respond to directional searches. (Previously
they were forward only.)
Changed the HTML autogenerate option for adding page numbers as
anchors, to ONLY add them as anchors, rather than adding the page
numbers AND anchors.
Changed the Unicode pop up window code a bit to have less scrolling,
not sure if I like the change or not.
Version .36(337k)
Downloaded, compiled and installed beta 14 of PerlTk 804.025 and
started forward porting. Unicode, here I come.
Went through the script making changes to bring it up to spec for
tk804. Most of the changes necessary are due to tightening of rules for
variable type definitions and parameter naming conventions. Everything
I have had to change so far is backward compatible with tk800. Tk804 is
much stricter about type mismatches.
Had to rewrite recent files and external functions menuing code to work
under tk804. After many hours of frustration, gave up on elegance and
went for brute force. On reflection, I probably should have done this
in the first place, since it made it possible to merge a lot of special
purpose menu handling code into one general purpose menu handling
routine. Cleaned up a lot of code in the menu building function to make
it more uniform and readable (and smaller, incidentally).
Rewrote the Greek transliteration tool nearly from scratch. What I had
worked, but it was bulky and hard to maintain. Tore out nearly 2000
lines of code and replaced it with under 200, with the same
functionality.
Added option to output utf-8 encoded characters to the Greek
transliteration tool. (Only available if you have tk 804 installed.)
Went through and did some cleanup of inefficient code (space wise).
Broke apart several functions into sub functions that can share a lot
of code.
Battled the word sorting functions to make them work with a mixture of
Latin-1 encoded letters and utf-8 encoded letters. Think I have reached
an uneasy truce, though it needs to be tested across platforms.
Added Drag 'n Drop capability for opening files. Open up a guiguts
instance and open up an explorer window. Drag a file from the explorer
window to the guiguts window. Voila!
Fixed bug where a single line inside /* */ markup would not get
indented by the rewrap function.
Fixed bug where Fix-up was adding an extraneous space between strings
of hyphens and double quotes.
Fixed a few problems with the auto generated footnote markup that
wasn't passing the W3C validator.
Added a function to convert utf encoded characters to numeric HTML
entities during HTML auto generate.
Fixed single space indent to not add spaces to blank lines that are
part of the selection, thus avoiding adding "space at end of line"
errors.
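The blank-line rule is simple enough to show in a short Python sketch. This is not the guiguts code; indent_lines is a made-up name, and treating whitespace-only lines as blank is an assumption.

```python
def indent_lines(selection, spaces=1):
    """Indent every non-blank line of the selection; leave blank lines alone
    so no trailing whitespace is introduced."""
    pad = " " * spaces
    return "\n".join(
        pad + line if line.strip() else line
        for line in selection.split("\n")
    )

indented = indent_lines("first stanza\n\nsecond stanza")
```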
Fixed problem with aspell.conf file update where dictionary changes
were being appended to the end instead of overwriting previous settings.
Fixed word frequency em dash check to ignore HTML comments.
Fixed problem where named entities in auto generated TOC were not being
converted during auto generate HTML.
Fixed /# #/ marked up text to be enclosed in appropriate HTML markup
during auto generate HTML.
Modified spell check in word frequency to allow you to add words to the
project dictionary. Highlight a word in the word frequency window and
press Control+Right Mouse Button to add it to the project dictionary
and redisplay the list.
Added a new experimental markup for poetry; /p .. p/ (or /P .. P/).
During text operations (rewrapping) it is treated just like /* .. */
markup except the default indent is hard coded to 4 spaces rather
than being adjustable. During HTML auto generation, text enclosed
in /p .. p/ markup will use special poetry markup similar to the markup
Jon Ingram proposed for the Mirror periodicals. In order for the
/p .. p/ markup to be converted properly to HTML there MUST be a minimum
of four spaces of indent (the default if you wrap the text with that
markup).
Added lots of Unicode character popup charts. Somewhat buggy in that
they rely on the display font having the characters implemented. If
characters aren't implemented, they display as an empty box. Every
character that is not implemented takes a short while to time out when
it is trying to display, so if there are a LOT of unimplemented
characters in the block of Unicode you are trying to display, it may
take a long time for the pop up window to show up. In the meanwhile,
the program appears to be frozen up. ***Note: Unicode functions will
only be available if you have Tk804 installed on your system.
Tracked down problem where a section for rewrapping that had three or
more blank lines following it would throw up a bunch of warnings on
the console window.
Fixed problem where rewrapping a file that had three or more blank
lines at the end would fall into an uninterruptable loop.
Rewrote part of Separator Fixup code to work around problems in
Tk804 that were causing it to run at glacial speeds.
Fixed problem with fix that would cause part of page separator to
remain if you opted to insert page numbers as HTML comments.
Fixed bug in spell check code that was causing spell checker to appear
to fail when it encountered an unrecognized word with underscores.
Added "Goto Page..." function to search menu, right under "Goto
Line...". Much as you might suspect, allows you to jump directly to a
page. Will only work if your file has page markers set.
Fixed problem in auto generate HTML code that would add extraneous bold
markup to already bolded text in the generated TOC.
Added a function to quickly enter a thought break
"*       *       *       *       *" under fixup menu. Will
automatically add it at the end of the line the cursor is in.
Overrode the default binding for Control-f to make it run the search
function instead of move the cursor right one space.
Control-f and Control-F are equivalent.
Added binding for Control-S to also save the file (in addition to
Control-s.)
Made many of the pop up windows stay on top. (At least, on top of
guiguts.) Works under windows, may not work cross platform. A drawback
is that the popup windows no longer have their own process on the task
bar. However, raising any of the popups will raise all of them. Also
added an option under preferences to enable or disable this feature.
Worked on /$ $/ block handling in HTML auto generate to try to
eliminate odd errors.
Fixed problem with page separator code locking up if there were several
page seps at the end of a file with nothing between them.
Fixed problem with scannos directory path being DOS formatted for all
OSs.
Finally fixed misspelled Interrupt Rewrap button that has been
misspelled for an embarrassingly long time.
Added another popup window to the footnote fixup function that will
display and highlight footnote anchor pairs that are suspected to have
a problem.
Changed the default for footnote fixup to be out-of-line.
Rewrote font choosing code. All fonts that are available on your
system are now offered as options in a drop down list.
Made Unicode pop up windows have selectable and resizable fonts.
Unicode window and main window do not need to use same font.
Fixed oddity in word frequency window where word with an emdash would
have incorrect word counts.
Changed right click context menu pop up to mirror the main menu. It has
always been there, but it was popping up the default menu which was
confusing at best.
Added option to output HTML numeric entities directly from the Unicode
pop up windows.
Fixed scrolling behavior in several popup windows.
Version .35(259k) Fixed boundary condition error in page separator fixup
code.
Worked heavily on HTML functions. Added div and span buttons, worked on
file parsing code, tried to automate title and author markup. Made it
generate a Table of Contents link list. Added some more styles to the
header file to deal with page numbers and poetry line numbers. Changed
line number code to display line numbers as well as make anchors.
Added a pop up status window while auto converting, to let user know
that it hasn't frozen, it is just working.
Fixed bug in recent files list where files that were renamed and saved
were not being added to recent files.
Fixed boundary condition bug in recent files list too.
Expanded recent files to 10 instead of five.
Modified special \n regex searches to start from scratch each time you
cycled through the file. The way it was, it only actually searched for
ALL of the instances of the assertion the FIRST time you searched on
it; it would save all of the hits to an array of indexes and then cycle
through the array. It still does that, but now it resets the search
each time through so that any edits you may have made to the file will
be picked up the next time through. It wasn't compensating for edits
and was pointing to the same indexes, even if they had been changed.
Fixed bug in guess page markers code that was preventing it from
running.
Version .341(260k) Realized I forgot to change the version number in the
title bar from pre.34. Ooops. Oh well. Now you've got .341.
Added default indent setting for /* */ markup. Set the default indent
under Prefs->Set Rewrap Margins. The local overrides will still work
( /*[6] .. */ ) but now you can set a default other than 0. If
you want to not indent, set default to zero.
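How a local override wins over the preference default can be sketched as follows in Python. This is only an illustration of the rule described above; the function name block_indent and the exact opener syntax accepted by the pattern are assumptions.

```python
import re

# Matches a block opener such as "/*" or "/*[6]" at the start of a line.
BLOCK_OPEN = re.compile(r'^/\*(?:\[(\d+)\])?')

def block_indent(open_line, default_indent=0):
    """Return the indent for a /* */ block: the local [n] override if present,
    otherwise the default from the preferences."""
    match = BLOCK_OPEN.match(open_line)
    if match is None:
        raise ValueError("not a /* block opener")
    override = match.group(1)
    return int(override) if override is not None else default_indent

with_override = block_indent("/*[6]", default_indent=2)
without_override = block_indent("/*", default_indent=2)
```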
Added \t tab interpolation to the replacement text for regex search
& replace. Any \t will be replaced by a tab in the replacement text.
Fixed a bunch of problems that were causing trouble for perl/Tk 804
series as reported by khandy. I'll need to take care of them eventually
since I plan to make the move myself in the relatively near future.
(The 804 series supports Unicode much better.)
Fixed the problem with extracting \n assertions for replacement
substitutions while searching with \n regex assertions. IE,
'(\W)to([\s\n])he(\W)' => '$1to$2be$3' will now work
correctly. Quote from .34 release notes: "You can not capture
newlines for replacement terms. That kinda sucks, but I don't see an
easy way around it (right now at least)." Turns out I needed to
add one (one!!!) character to the script to get it to work......
No easy way around it indeed. Ah well..... It works now anyway.
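For comparison, the same substitution written in Python looks like the sketch below. It is not how guiguts does it internally (guiguts searches a Tk text widget); the function name is made up, and Python writes the captured groups as \1, \2, \3 where guiguts uses $1, $2, $3.

```python
import re

def fix_to_he(text):
    """Replace the scanno "to he" with "to be", keeping whatever whitespace
    (space or newline) separated the two words."""
    return re.sub(r'(\W)to([\s\n])he(\W)', r'\1to\2be\3', text)

# The captured newline survives the replacement.
fixed = fix_to_he("I want to\nhe there.")
```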
Version .34(259k) Added check for possible emdash errors to Word
Frequency hyphen check. Will now list all of the words that
contain a hyphen, all of the words that are identical except that they
DON'T have a hyphen and (new) all the words that are identical except
that they contain an emdash (two hyphens).
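The three-way comparison can be sketched in Python as below. This is a simplified illustration rather than the guiguts routine: it assumes the word list has already been extracted, and hyphen_suspects is a made-up name.

```python
def hyphen_suspects(words):
    """For each hyphenated word, list the variants that also occur in the text:
    the same word without the hyphen, and the same word with an emdash (--)."""
    vocab = set(words)
    report = {}
    for word in sorted(vocab):
        if "-" in word and "--" not in word:
            variants = [
                candidate
                for candidate in (word.replace("-", ""), word.replace("-", "--"))
                if candidate in vocab
            ]
            if variants:
                report[word] = variants
    return report

# Hypothetical word list.
suspects = hyphen_suspects(["to-day", "today", "to--day", "well-known"])
```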
Added a "Check Emdashes" function to the Word Frequency window.
This is somewhat misleading. An emdash phrase is, by definition, not a
word, so having it under word frequency is not absolutely accurate. I
was forced to allow a lot of non-word
characters through to cut down on unacceptably high false positive
error reporting. The addition of this function probably adds
about 10-15% to the initial processing time of the word frequency
routine, but it is useful enough to justify it, I believe.
Added an option under the Prefs menu to disable the automatic
highlighting of pairs of quotes, single quotes, brackets and
parentheses, if desired. Some people found it distracting.
Fixed bug in File open code where if you already had a file open that
had edits, and opened another file, it was discarding the edits without
notice and opening the new file.
Reformatted the Greek Transliteration tool from a vertical to a
horizontal layout. Removed whitespace from character images to reduce
display size of tool. Redid the "Rough Breathing" accents to be easier
to discern. They looked too much like acute accents. Changed the HTML
code generation to produce the correct accented character codes. Added
a text box to the bottom of the Greek tool. Assemble the Greek phrase in
the transliteration tool, then transfer it over to the main editing
window. Either cut and paste, or place the cursor in the main window,
then hit "Transfer".
Made a bunch of minor improvements to the spell check handling code. Due
to the way the spell check interacts with guiguts you could get odd
results if you have made a bunch of unsaved edits before running spell
check. To prevent this, I made the spell check code save the file if you
have any unsaved edits when you run spell check. Just something to be
aware of.
Added ability to check for mismatched European style angle quotes « »
(guillemets) to the Find Orphaned Markup and Brackets. It is not
really intuitive, it being there, but it easily uses the same search
code, so it was a win as far as programming goes.
Added "Middle Dot", char code 0183, "·" to the Latin-1 character
tool.
Added an "Edit" button to the stealth scannos window. Add, delete or
edit the regexes and/or regex hints for the currently open scanno file.
Everything is keyed off of the regex search term. Right now if you want
to edit a regex, you need to add it, then delete the original. If you
only edit the replacement term or hint, just press add when you are
done editing. Save will write the changes permanently to the disk.
Cancel will drop you back to the scannos window where you can use and
test any edits you have made. CHANGES WILL NOT BE SAVED TO DISK UNTIL
YOU HIT SAVE FROM THE EDIT WINDOW. If you want to revert, hit Cancel
then rerun Stealth Scannos. That will reload the regex file from the
disk.
Added a "term count" to the stealth scanno window to help you keep
track of how many regexes are available and where you are in the list.
Changed sort order of regex scannos to be in alphabetical order. It
doesn't really make sense, but at least it is guaranteed to be in that
order. Hashes are not guaranteed to be in any particular order, leading
to oddities in how the terms are presented from search to search.
Radically improved search for words with accented characters in them by
changing the border condition assertions used in the searching code.
Instead of using \b to detect word borders (which gave incorrect
results for accented characters), I switched to (?<!\p{Alpha}) to
detect a leading border and (?!\p{Alpha}) to detect a trailing border.
These assertions work exactly like the \b assertion for unaccented
characters but unlike \b, will also work for accented characters.
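The difference between the two kinds of border test can be sketched as follows. This is a Python illustration, not the guiguts code: Python's re module has no \p{Alpha}, so the class [^\W\d_] (any Unicode letter) stands in for it, and the re.ASCII flag is used to imitate a \b that only knows unaccented characters. The helper name whole_word is made up for the example.

```python
import re

# Any Unicode letter: a word character that is neither a digit nor "_".
# This stands in for Perl's \p{Alpha}, which Python's re module lacks.
LETTER = r"[^\W\d_]"

def whole_word(term):
    """Compile a search for term that is not touching another letter."""
    return re.compile(rf"(?<!{LETTER}){re.escape(term)}(?!{LETTER})")

text = "\u00e9lan lan"  # the accented word "elan" followed by plain "lan"

# An ASCII-only \b sees a border between the accented e and "l",
# so it also reports the "lan" inside the accented word.
ascii_hits = [m.start() for m in re.finditer(r"\blan\b", text, re.ASCII)]

# The letter-based assertions accept only the standalone word.
letter_hits = [m.start() for m in whole_word("lan").finditer(text)]

print(ascii_hits, letter_hits)
```

The lookbehind and lookahead consume no characters, so the match itself is still just the search term.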
Added some regex verification code to the regex search functions. It
will warn you if you are trying to search with an invalid regex
assertion rather than just silently failing.
☺☺☺Very cool news! After much
twiddling, hair pulling and muttering, I have gotten the \n newline
regex assertion working!☺☺☺
Finally, an easy way to do those ',(?=\n\n)' => '.' search and
replaces. It is inelegant, slow, memory intensive and slightly erratic,
but it is 100% better than it used to be. :-) There are several
idiosyncrasies. You cannot search in reverse if you have a newline
assertion in your regex. (You can set reverse search, it will just
continue to search forward.) The highlighting may be slightly off,
especially if you start doing variable length \n searches, but it is
pretty close most of the time. You cannot capture newlines for
replacement terms. That kinda sucks, but I don't see an easy way around
it (right now at least). The initial search is somewhat slow,
especially if there are lots of matches found in your text. (I wouldn't
recommend searching for '\n' unless you have nothing better to do for a
while. It will match EVERY line.) It slams memory pretty hard too; if
you don't have much physical memory, it may end up swapping in and out
to disk a lot. Nevertheless, this is still pretty awesome!
The newline assertion activation really makes a lot of neat tricks
possible that weren't either easy or possible before. When the regex
engine searches for a term with a newline assertion in it, it does a
search that will match newlines with a "." (match any character). This
allows you to do neat little tricks to search across multiple lines
even for regexes that don't nominally use \n assertions. Just having a
\n assertion activates the special search function. Say you wanted to
find matching pairs of guillemets « ». They are not guaranteed to be on
one line, and in fact are more likely than not to be split across
lines. However, you can search using the \n assertion to treat the file
as one long string. Search for '«[^»]+»\n?'. The \n in the regex is not
necessary for the search, however, it is necessary to activate the
special search. This will work for any single character delimiters.
Search for pairs of low lines; '_[^_]+_\n?', search for pairs of double
quotes; '"[^"]+"\n?'. And so on. A common scanno is "to he" instead of
"to be". A regular search will turn up any occurrences in one line. But
what if "to" is at the end of a line and "he" is at the start of the
next? Search for 'to[\s\n]he'. The drawback with this is you cannot
extract variables for replace if you use a newline, so it may be better
to break that into two searches. These are meant more as examples than
as suggestions.
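The "treat the file as one long string" idea can be tried outside guiguts. The sketch below uses Python's re module on a made-up sample string; it only illustrates the three searches mentioned above and says nothing about how the guiguts search code is written. Python searches the whole string at once, so no trailing \n? is needed to switch anything on.

```python
import re

# Made-up sample: a scanno split over two lines, a paragraph that ends
# in a comma, and a pair of guillemets that spans a line break.
text = "He wanted to\nhe free,\n\nShe said \u00abcome\nback\u00bb at once."

# ',(?=\n\n)' => '.': a comma that ends a paragraph becomes a full stop.
fixed = re.sub(r",(?=\n\n)", ".", text)

# A pair of guillemets, whether or not it stays on one line.
quoted = re.findall("\u00ab[^\u00bb]+\u00bb", text)

# The "to he" scanno, including the case split across two lines
# (\s already covers the newline in Python).
scannos = [m.start() for m in re.finditer(r"to\she", text)]

print(fixed)
print(quoted, scannos)
```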
Removed the three "Blank Line" searches from the menus as they are no
longer needed with the \n assertion activated in regex searches.
Added option under the Prefs menu to shut off the audible bell on
warnings. It was starting to annoy me while working on the regex search
code. Setting will be saved and remembered.
Made search buttons flash on warning. Visible warning useful if
audible warning is turned off. I had to change the highlight color from
default gray, otherwise it flashes from gray to the same shade of
gray..... which isn't very useful.
Added option under the Prefs menu to change the highlight color of
buttons. When I added the highlight color, I chose a default that I
liked. Other people may disagree. Now they can change it.
Broke apart the $f internal variable for external module calls into
components as I speculated in the forums. Now have $d - directory path,
$f - file name and $e - file extension. To get the equivalent of what
used to be $f all by itself, you now need to use $d$f$e.
Reconfigured the regex hint window to be a little more user friendly.
No longer need to close it every time between hints.
Added an "Auto Advance" option to the scannos window. When checking
stealth scannos, selecting auto advance will automatically cycle
through the search terms until it returns a successful search on the
text. If the search term is not found in the text, it will load
the next scanno and search again.
Version .33 (251k)
Added a few new hot keys. Alt-Right arrow will indent selected text one
space. Alt-Left arrow will move selected text out one space.
Added a whole bunch of menu shortcuts. Nearly every menu item can now
be run from the keyboard, (rather than needing a mouse.) Some functions
will still need a mouse to be used effectively, but you can at least
run the function from the keyboard. Press the Alt button to see the hot
letters stand out on the menu items.
Fixed obnoxious bug in search code where you could not search for a
zero. Was using a brain dead method to check if there was a search
term. It wasn't smart enough to know that zero ("0") was not equivalent
to null ("").
Exposed a few internal variables for use in calling external modules, if
desired. Right now I only have three. I can easily add more if they are
desired. The exposed variables are "$f"; the current open file name
with full path, "$i"; the (i)mage directory with full path and "$p";
the file number corresponding to the (p)age where the cursor is in the
currently open file. For example you can pass the name of the png file
of the current page to a program using the command:
"C:\some\path\program.exe $i$p.png" - Or pass the current
file to your default handler "start $f" (useful to view HTML files in
progress) -
Note: if you try to use any of these variables when they are not set,
you will get errors. IE, trying to use $f before you have opened a file
will not work. Caveat: For windows systems, you must use DOS friendly
path names for programs. The passed parameters will be automatically
DOSified, parameters you type in will not.
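As a rough picture of what the variable substitution does, here is a minimal Python sketch. The function name expand_call, the dictionary of values and the sample paths are all invented for the illustration; the real substitution lives in the guiguts Perl code and also handles the DOS-friendly path conversion, which is not attempted here.

```python
def expand_call(template, values):
    """Replace $f, $i and $p in template with the values supplied.

    values maps a variable letter to its string, or to None when the
    variable has not been set yet (for instance, no file is open).
    """
    result = template
    for name in ("f", "i", "p"):
        marker = "$" + name
        if marker not in result:
            continue
        if values.get(name) is None:
            # Mirrors the note above: an unset variable is an error.
            raise ValueError(f"{marker} is used but has no value yet")
        result = result.replace(marker, values[name])
    return result

# Hypothetical values for a project whose page images are in C:\dp\pngs
values = {"f": "C:\\dp\\book.txt", "i": "C:\\dp\\pngs\\", "p": "042"}
command = expand_call("C:\\some\\path\\program.exe $i$p.png", values)
print(command)
```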
Added a button to the scannos search function called "Hint". This will
pop up a window with a brief description of the regex search term that
is currently loaded in the search window. The descriptions are located
in the same file that the scannos are in and are loaded at the same
time. There does not need to be a hint for each scanno entry; there can
be, but it isn't strictly necessary. I have edited the regex.rc included
with the script file to include hints for all of the regexes.
If you want to add or modify any hints, please use the format
demonstrated in regex.rc. Any scanno file that can be used with the
stealth scannos search function can have hints.
Modified search function to allow you to search on a selection. In
order to make the function as unobtrusive as possible, I made it so it
doesn't reset the search when you reach the end of the searched
selection. Instead, it will beep when you reach the end of a selection
but continue searching past the end of the selection if you search again.
Added another custom interpolation to the regex replacement engine: "\A"
- Convert to an HTML named anchor. Used much like the \U (upper case),
\L (lower case) and \T (title case) interpolations. Do a regex search
for some text and do an interpolation on it. In this case, convert it
to a named anchor. For example, say you are working with a cook book
that has all of the recipe names in all caps, left justified, and you
want to make named anchors to hyperlink the index or TOC. Do a search
and replace: '^(\S\P{Lower}+)$' => '\A$1\E' Or, "Find all lines that
don't have a space as their first character (not indented) and have no
lower case characters (but may have punctuation, accented characters or
spaces) and convert to a named anchor." (The \E means "End of
Interpolation".) A named anchor looks like this: <a
name="ANCHOR_NAME"></a>. A named anchor can not contain double
quotes, spaces or accented characters. Those will all automatically be
removed/replaced. For example: say you did the aforementioned cookbook
search and replace. You might start with the text -
BUTTERSCOTCH PUDDING
Ingredients
How to cook
How to serve
CHOCOLATE PUDDING
Ingredients
How to cook
How to serve
VANILLA PUDDING
Ingredients
How to cook
How to serve
YORKSHIRE PUDDING FLAMBÉ
Ingredients
How to cook
How to serve
After running the regex search and replace, '^(\S\P{Lower}+)$' =>
'\A$1\E', you would have:
<a name="BUTTERSCOTCH_PUDDING"></a>BUTTERSCOTCH PUDDING
Ingredients
How to cook
How to serve
<a name="CHOCOLATE_PUDDING"></a>CHOCOLATE PUDDING
Ingredients
How to cook
How to serve
<a name="VANILLA_PUDDING"></a>VANILLA PUDDING
Ingredients
How to cook
How to serve
<a name="YORKSHIRE_PUDDING_FLAMBE"></a>YORKSHIRE PUDDING FLAMBÉ
Ingredients
How to cook
How to serve
Notice the É was deaccented and spaces converted to underscores.
Very useful if the items needing named anchors can be found using
regexes.
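For anyone who wants to see the anchor clean-up in isolation, here is a small Python sketch. It is an approximation written for this note, not the \A code itself: the function names are invented, double quotes are assumed to be dropped rather than replaced, and is_heading only imitates the '^(\S\P{Lower}+)$' test because Python's re module has no \P{Lower}.

```python
import unicodedata

def anchor_name(text):
    """Build an anchor name with no accents, double quotes or spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    return plain.replace('"', "").replace(" ", "_")

def is_heading(line):
    """Imitate the regex: not indented, no lower case after the start."""
    return (len(line) >= 2 and not line[0].isspace()
            and not any(c.islower() for c in line[1:]))

def add_anchors(lines):
    """Put a named anchor in front of every heading line."""
    return [f'<a name="{anchor_name(line)}"></a>{line}'
            if is_heading(line) else line
            for line in lines]

page = ["YORKSHIRE PUDDING FLAMB\u00c9", "Ingredients", "How to cook"]
result = add_anchors(page)
print("\n".join(result))
```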
Modified File open routine to automatically run Set Page Markers on
file open. It won't hurt if there are none, and runs so quickly that
the overhead is negligible.
Fixed External Calls menu building function to update correctly when
setting up calls. No longer need to exit and restart to see
modifications.
Version .321 (247k) Fixed a minor but aggravating bug in the file save
code that would sometimes not let you save edits.
Version .32 (247k) Fixed a few more spelling errors in the UI. Sigh.
Made guiguts recognize the /$ $/ markup for compatibility purposes. At
this point, /* */ and /$ $/ are treated almost exactly the same. (I'm
pretty sure I got all of the places where it will matter. If not, I'm
sure someone will let me know.) Added or modified functions where
appropriate to make the /$ $/ markup behave as expected. The only
difference is that with /* */ you can set a relative indent for
wrapping and with /$ $/ you can't.
Implemented a "Last five files opened" history under the File menu for
one click opening of previously opened files. It has a little
intelligence in that it considers multiple opens of the same file as
one instance. IE, you won't end up with the recent files history filled
with pointers to the same file. I had to rewrite some portions of the
menuing code to allow finer control of display parameters. The original
method was very concise and compact, but did not allow granular control
over the menu display. The changes should not be visible to the user,
(except for the recent files list addition.) I've tested it a fair
amount and it seems to look and work pretty much like it did before.
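The "multiple opens count as one instance" behaviour boils down to a short list update. A minimal Python sketch, with an invented function name and sample file names, and no claim that guiguts stores its history this way:

```python
def remember_file(history, path, limit=5):
    """Return a new history with path first and no duplicate entries."""
    updated = [path] + [p for p in history if p != path]
    return updated[:limit]

history = []
for name in ["a.txt", "b.txt", "a.txt", "c.txt"]:
    history = remember_file(history, name)

print(history)
```

Reopening a.txt moves it to the top instead of adding a second entry, and anything past the fifth slot falls off the end.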
Fixed some minor bugs in the file saving code which would sometimes
make it impossible to decline saving changes when opening another file
without closing the program. Wasn't really noticeable until the recent
files list became available.
Fixed an error in the "Block rewrap with parameters" code. When a block
rewrap with parameters was used with a block rewrap without parameters
(use defaults) after it, the values for the Block rewrap margins were
not returning to the defaults for the non parametrized Block rewrap
markup. If this means nothing to you, don't worry about it. It would
only bother you in fairly obscure circumstances. Thanks to Curtis W.
for finding the bug and for submitting a patch!
Fixed problem with page separator fixup routine where it would
mistakenly delete thought breaks if they were adjacent to a page
separator.
Added another Menu to the interface. External operation calls. It is a
user configurable menu of calls to external programs. You can set up
guiguts to call external programs from within the program. There are 10
slots that you can use to set up with external calls. Call any program
using the same parameters that would be used in the Windows
Start->Run box or at a command prompt. For Windows, if you have a
registered extension, you can start the associated program
automatically by using 'start [filename]'. For instance, to open a web
page using the default browser, enter 'start http://www.pgdp.net'
(without the quotes). If you are calling a program that has a space in
the path name, you must enclose the program name in double
quotes. IE, "C:\Program Files\Accessories\wordpad.exe". I have
included a few examples. Click on setup at the bottom of the External
menu to see/edit them. You can also edit the setting.rc file directly
if you prefer. Make a backup copy first though, if you choose to go that
route. Right now you can only make explicit calls. Eventually I hope to
have hooks to internal variables available too. (Current open file
name, current page number, current working directory, etc.) At this
point, when you make changes to the menu parameters, they will not be
updated on the interface until you close and restart the program. (They
actually WILL be active, but the interface will not change to reflect
it.) I am having trouble figuring out how to dynamically update the
menus. The code that worked for the recent files list fails miserably
here. I'll keep poking at it.
Fixed a bunch of broken links in manual. Did some more proofing and
editing and added some new material.
Version .31 (244k) Finally got selection of different dictionaries from
within guiguts working. Not as elegant as I would have liked, but it
works. Could not seem to get it working through the programming
interface. Finally gave up and am just writing changes to the
aspell.conf file. Will create one if it doesn't exist. (As a
consequence, spell check will need to be closed and restarted for the
dictionary change to take effect.) Will modify the "master" (master
dictionary) and "lang" (language) lines if it does exist. Should not
affect any other settings in your aspell.conf file.
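The aspell.conf rewrite described above can be pictured with the following Python sketch. It assumes the usual "option value" line format of aspell.conf; the function name and the sample lines are made up, and it works on text rather than touching a real file.

```python
def set_dictionary(conf_text, master, lang):
    """Return conf_text with its 'master' and 'lang' lines set.

    Other lines are kept untouched; missing lines are appended, so an
    empty string (no file yet) yields a fresh two-line configuration.
    """
    wanted = {"master": master, "lang": lang}
    output = []
    for line in conf_text.splitlines():
        words = line.split()
        key = words[0] if words else ""
        if key in wanted:
            output.append(f"{key} {wanted.pop(key)}")
        else:
            output.append(line)
    output.extend(f"{key} {value}" for key, value in wanted.items())
    return "\n".join(output) + "\n"

# Illustrative existing configuration with one unrelated setting.
existing = "lang fr\nencoding iso-8859-1\n"
print(set_dictionary(existing, "en_US", "en"))
```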
Fixed several minor bugs in the Greek transliteration function.
(Actually one bug and several aesthetic problems.) Thanks for the heads
up, Curtis!
Revamped and fleshed out manual a great deal. Added a lot more detailed
info, added a bunch of explanation for things I have gotten questions
about. Rearranged layout a bit. Added TOC with links. I've spell
checked it and read through it several times, but I'm sure there are
still errors. If you spot something (misspelling, wrong word, whatever)
please let me know. I figure, with this bunch reading it, there's a
fair chance that errors will be spotted. :-)
Added "Align text on string" function. Useful for aligning columns of
text that you want to align on a certain text string. IE, align
columns of numbers on decimal points, or contents lists on a
period. The default alignment character is a period/decimal point (full
stop). You can change it to any character or string of characters you
like. It will align on the first occurrence of the marker string found
in the line. If the alignment marker string is not found in a line, the
line will not be changed.
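The alignment rule (first occurrence of the marker, untouched if the marker is absent) can be sketched in a few lines of Python. This is an illustration with an invented function name; it simply pads on the left and does not try to reproduce how guiguts treats indentation that is already there.

```python
def align_on(lines, marker="."):
    """Indent lines so the first occurrence of marker lines up."""
    positions = [line.find(marker) for line in lines]
    target = max(positions, default=-1)
    aligned = []
    for line, pos in zip(lines, positions):
        if pos < 0:
            aligned.append(line)          # marker absent: leave unchanged
        else:
            aligned.append(" " * (target - pos) + line)
    return aligned

column = ["1.5", "10.25", "none"]
print("\n".join(align_on(column)))
```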
Tweaked a few of the scanno.rc regex expressions very slightly. Added
the v\b -> y assertion to the file.
Added some buttons to the gutcheck view to easily make bulk changes to
the gutcheck error view options; Hide all, See all and Toggle view.
Made a minor change to the scanno window file loading code. If you are
loading a scanno file that contains the string reg somewhere in the
name, it will automatically set the search window to use regex search
settings. If it doesn't have reg in the name, regex search will be
automatically unselected.
Changed all of the word frequency routines to be sortable both
alphabetically and by frequency. The main function was sortable both
ways, but all of the sub functions were only available with an
alphabetical sort. I had to make some subtle changes to some of the
functions to accomplish this, hopefully nothing that will be
problematic. The default sort order for all functions is now by
frequency (it is the word frequency routine, after all.) To change the
sort order, check Sort Alpha and press Re Sort. Change back by
unchecking Sort Alpha and pressing Re Sort again.
Version .302 (229k) Arrgh. Stupidity fix to gutcheck filename parsing
code.
Version .301 (229k) Minor fix to HTML image insertion code. Was
forgetting image directory every time an image was inserted.
Version .30 (k) Made changes in the scroll wheel handling code to try to
get it to work correctly under Windows XP. It has always worked under
Win2K, WinMe, Win98 and Win95, but for some reason WinXP handles scroll
events slightly differently. Not having an XP system myself, I never
noticed that it didn't work. Hopefully this will address the problem.
Fixed scroll bar in spell check replacement words list to resize
correctly to the length of the list.
Fixed HTML image code generator to use forward slashes instead of back
slashes.
Added update check function under help menu. Will connect to the server
where guiguts is hosted and compare your version with the latest
version on the server. Will pop up a message saying either your version
is the most current or that there is a newer version available. (Or
that it couldn't connect to the server.) Obviously, you are going to
need some sort of Internet connection for this to work.
Added a "stealth scanno" function to the word frequency window. It will
ask for a scanno list to use, (It needs to be a word list, the regex
lists won't work, at least not the way you would want.) then sort out
all of the words that appear in the list. Since the scannos can
theoretically go either way, both terms will appear in the word list.
(As long as the actual word appears in the text.) This isn't just
confined to lists of scannos either. You could use any list of words to
come up with matches as long as it is formatted correctly.
Added another word list to use in the new scanno word frequency
function called misspelled.rc. It contains about 3500 of the most
common misscanned words. About 95% of this will already be covered by
guiprep during preprocessing, but, hard as it is to believe, there are
some people who don't use guiprep. :-) This word list lends itself very
well to the word frequency scanno function but would be extremely
aggravating to use in the search scanno function.
Changed calling of search window from word frequency to pop it up on
top if it is already open. If a search window already existed, it would
not call it again, but it would also leave it covered by other windows.
Now it will pop up if you call it.
Fixed file open code to open the bin file (with the page markers in it)
if the filename is passed as an argument to the program. (Assuming it
exists.)
Fixed code to save settings file in the correct spot if a file name is
passed as an argument to the program.
Made some edits to the english-common scannos file distributed with the
program. Added a few, removed a few.
Version .29 (217K) Beefed up the ASCII box art drawing function
quite a bit to make it more adaptable and user friendly. Allows
customizable frame characters, justification, and selectable rewrap.
Added whitespace characters to the character count function. The
characters are represented by their names rather than by the actual
character (for what should be semi-obvious reasons). The searching code
has been modified to allow searching on whitespace characters. It
doesn't seem like the search for newlines works, but look at the bottom
line indicators, they change every time you search for one. Since there
is one at the end of every line (by definition), it is of limited use.
The tab searching is of value though.
Worked on the word frequency page Up/Down code, now moves correct
number of lines and moves the active selection fairly
predictably. Had to override some of the default behavior to get
this working. I'm pretty sure I didn't break anything else in the
process.
Worked a bit on auto generation of chapter named anchors during HTML
auto convert.
Fixed boneheaded problem with fixup routine with over enthusiasm in
"fixing" lst -> 1st "errors".
Version .28 (216K) Fixed a couple of warnings that were popping up
occasionally when doing search and replace.
Found a bug in the regex search and replace function. When doing
variable length extraction of named property assertions, the
replacement extraction only works once. I have no idea why this is
happening. Everything seems to be working correctly, it just will not
allow you to use a variable length named property assertion more than
once. IE (\p{IsUpper}+) => \L$1\E will work, but only
once, and I can't figure out why. For now, try to avoid using variable
length named property assertions for variable extraction during regex
searches.
Worked on Auto List and Auto Table code in the HTML Fixup window quite
a bit, to make it a little more user friendly and intuitive. Lots of
little things added/changed. Too many to list (or remember.)
Added a few more markup buttons to the HTML window: <small> and <big>.
Added the optional modifier code to the /* .. */ markup like I
speculated about in the forums. /* .. */ markup with no modifiers works
like it always did, no rewrap, no indent. Markup with an indent
modifier, /*[4] .. */, will adjust the indent in the block so that the
left-most line will be set to have the indent specified and all other
lines will be adjusted to keep their same relative indent. The modifier
is an absolute indent. Negative numbers will be ignored.
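To make "absolute indent, same relative indent" concrete, here is a small Python sketch of the shift. It is a guess at the behaviour described above, not the guiguts rewrap code: the function name is invented, only leading spaces are considered, and blank lines are passed through untouched.

```python
def set_block_indent(lines, indent):
    """Shift a block so its left-most line starts at indent spaces."""
    if indent < 0:
        return list(lines)                # negative modifiers are ignored
    filled = [line for line in lines if line.strip()]
    if not filled:
        return list(lines)
    current = min(len(line) - len(line.lstrip(" ")) for line in filled)
    shift = indent - current
    shifted = []
    for line in lines:
        if not line.strip():
            shifted.append(line)          # keep blank lines as they are
        elif shift >= 0:
            shifted.append(" " * shift + line)
        else:
            shifted.append(line[-shift:]) # only leading spaces are cut
    return shifted

block = ["  Roses are red,", "    Violets are blue,", "", "  Sugar is sweet"]
print("\n".join(set_block_indent(block, 4)))
```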
Changed the auto generated HTML to use text style span indents instead
of padding with nonbreaking spaces. A little more elegant and easier to
read, especially for deep indenting.
Changed Gutcheck routine to automatically save the file if it has been
edited rather than just flash up a message about it.
Changed the sorting code in the word frequency - Alpha/Num check to not
list numbers with commas and periods since they are fairly common and
make the results less useful. (Too many false positives) Changed the
Check MiXeD CaSe routine to sort out all words that have lower case
letters and an upper case letter not in the first position. Makes it
much easier to pick out errors LlKE THlS.
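The mixed case test as described (some lower case letters, plus an upper case letter anywhere but the first position) is easy to state in code. A Python sketch with an invented function name, for illustration only:

```python
def is_mixed_case(word):
    """True if word has a lower case letter and a capital after position 0."""
    has_lower = any(c.islower() for c in word)
    upper_inside = any(c.isupper() for c in word[1:])
    return has_lower and upper_inside

words = ["LlKE", "THlS", "Like", "THIS", "this"]
suspects = [w for w in words if is_mixed_case(w)]
print(suspects)
```

Legitimate words with an inner capital will show up too, which is why the results still need a human eye.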
Gave up on my attempt to make my rewrapping routine follow text layout
conventions regarding orphaned words at the ends of paragraphs.
In general, it is considered poor practice to leave a line less than 10
characters at the end of a paragraph. Typically, what is done is a word
is moved down from the preceding line to flesh it out. I had this
implemented, and it worked pretty well, but occasionally, more often if
the rewrap margins were set fairly low, (60-65), it would result in a
line that would be reported as short by gutcheck. Well, yes, it was
short by gutcheck's standards, but it was carefully trying to follow
standard text layout convention. I have, however, grown tired of
explaining that that is indeed a feature, not a bug, and so, since
nobody seems to want it, disabled that part of the rewrapping code. Now
if you end up with one character left over at the end of a line, the
rewrapper will cheerfully put it on a line by itself. (Of course,
gutcheck will report THAT as an error too....)
Fixed minor bug where rewrap function would strip spaces from in front
of thought break if selection ended just before it.
Added a somewhat bizarre function to automatically draw ASCII art boxes
around a selected block of text. I was using it while post
processing a Punchinello issue, to lay out the advertisements more like
they are in the magazine. It works fairly well but I'm not sure that it
should be used, really. It makes it necessary that the text be viewed
using a fixed width font, and makes it difficult to rewrap. Still, it
is fun to play with, and makes it easy to do such things, if you are of
a mind to. The selection NEEDS to start and end on blank lines for
it to work correctly.
Version .27 (213K) Made orphan brackets and markup search function
recognize simple 1 level nesting. IE (Text like (this) will pass.) (But
text (like) (this) will still need to be checked.) I could allow for
unlimited nesting, but then missing or wrong brackets would slip
through fairly easily.
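One way to get exactly this "one nested pair passes, anything more gets flagged" behaviour is a regex that accepts a bracket pair holding at most one plain inner pair. The Python sketch below is only a guess at the approach; the pattern and the function name are invented, and it handles round brackets only.

```python
import re

# A bracket pair containing no brackets, or exactly one plain nested pair.
ACCEPTED = re.compile(r"\([^()]*(?:\([^()]*\)[^()]*)?\)")

def needs_checking(text):
    """True if brackets remain once every accepted group is removed."""
    leftover = ACCEPTED.sub("", text)
    return "(" in leftover or ")" in leftover

print(needs_checking("(Text like (this) will pass.)"))
print(needs_checking("(But text (like) (this) will still need to be checked.)"))
```

Keeping the accepted pattern this narrow is the design choice explained above: a more forgiving pattern would also forgive genuinely missing brackets.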
Made bracket search highlight both brackets it is questioning instead
of just the first.
Made the script remember the pngs directory from session to session as
long as the file has been saved. Since it is project specific, it is
saved in the bin file associated with a particular project file. As a
result, it will only be saved when the bin file is saved, which is only
saved when the project file is saved.
Fixed error when setting images directory through the prefs menu. Was
supposed to open the first image file in the directory (a quick check
that you had the correct directory) but would pop up an error that file
(path).png could not be found. Changed code so it should work as
expected.
Modified code to save the window geometry (for the windows for which it
tracks the geometry) right after it gets a resize (or move) command.
There is actually a 300 ms delay so it doesn't try to continuously save
the settings as you are trying to resize/move the window.
Puzzled out how to allow navigation of listboxes using arrow and page
up/down keys. (I feel pretty stupid about this one. All I had to do was
switch the focus. Clicking on it with the mouse doesn't change the
focus oddly enough. It needs to be explicitly set. Or you can set the
focus by tabbing between the widgets until the listbox has focus, but
that little tidbit is not documented anywhere easily accessible.) The
drawback is that the current selection is now underlined, which I don't
really care for, but the benefits outweigh the negatives.
Experimented a bit with not sorting a scannos file if it contains the
character sequence "reg" in the title. Much as suspected, the results
come back in nearly random order. (Which, for the regex files, is not
much worse than alphabetical order.)
Fixed problem with rewrap function where it would sometimes lose some
of the page markers if text was heavily rewrapped. (For instance,
change the rewrap margin from 72 to 50 and rewrap.)
Fixed problem with rewrap function where it would eat a blank line if
there was an even number of blank lines in a row. It wouldn't change
anything if there was 1, 3, 5, etc. but if there were 2, 4,... lines in
a row, it would delete one. :-?
Fixed oddity where rewrap function would add an extra blank line at the
end of the rewrapped text if the selection did not end exactly on a
blank line.
Fixed problem where if selection did not begin on a line containing
text, rewrap function would delete one blank line before the paragraph.
Cleaned up a bunch of warnings in the rewrap routine. (mostly boundary
conditions on empty variables, either uninitialized or empty after
processing.)
Added "Auto Table" and "Auto List" functions to the HTML fixup window.
Auto table will put <table> </table> around the
selection, put <tr><td> at the beginning of each line in
the selection, put </td></tr> at the end of each line and
replace any instance of two or more spaces together in a line with
</td><td>. Auto List puts <ul> </ul> around the
selection, <li> at the beginning of each line and </li> at
the end of each line in the selection.
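The two transformations are mechanical enough to show in a few lines. This Python sketch follows the description above (cells split wherever two or more spaces occur together); the function names and sample rows are invented and it is not the HTML fixup code itself.

```python
import re

def auto_table(lines):
    """Wrap lines in a table, one row per line, cells split on 2+ spaces."""
    rows = ["<tr><td>" + re.sub(r" {2,}", "</td><td>", line) + "</td></tr>"
            for line in lines]
    return ["<table>"] + rows + ["</table>"]

def auto_list(lines):
    """Wrap lines in an unordered list, one item per line."""
    return ["<ul>"] + [f"<li>{line}</li>" for line in lines] + ["</ul>"]

print("\n".join(auto_table(["Apples  3", "Pears   12"])))
print("\n".join(auto_list(["Flour", "Eggs"])))
```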
Added a few more Greek glyphs to the transliteration chart. They are
low usage but not terribly uncommon. For the most part they are not
available as HTML entities.
Version .26 (202K) Added
a function to reformat poetry line numbers along the right side of the
text. It will look for numbers in the rightmost columns separated from
the text by at least two spaces, then add spaces until the right edge
of the number is at the rewrap margin. Adjust the right rewrap margin
to change where they are placed. It will put at least two spaces before
the number, even if it makes the number exceed the rewrap limit, so it
can find it again if you choose to run the routine again.
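The placement rule for a single line can be sketched like this in Python. The regex, the function name and the default margin of 72 are assumptions made for the example; the real routine works on the text in the editor and reads the margin from the rewrap settings.

```python
import re

# Text, at least two spaces, then a number at the very end of the line.
NUMBERED = re.compile(r"^(.*?\S) {2,}(\d+)\s*$")

def place_line_number(line, margin=72):
    """Move a trailing line number so its right edge sits at margin."""
    m = NUMBERED.match(line)
    if not m:
        return line                       # no line number: leave unchanged
    text, number = m.groups()
    # Always keep two spaces, even if the number then passes the margin,
    # so a later run can still recognise it as a line number.
    gap = max(2, margin - len(text) - len(number))
    return text + " " * gap + number

print(place_line_number("Of man's first disobedience  5", 40))
```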
Reformatted the menu layout as discussed in the forums.
Changed the fixup routine to first pop up an option window so you can
select what fixes to run. You can also select whether to run the
routine inside /* */ marked blocks or not.
Started writing a routine to find orphaned markup, but decided to cheat
and just grafted it onto the bracket orphans search routine. :-)
Added a routine to find and remove blank lines before page separators
without actually removing the separator. This is a low usage function,
only certain projects will benefit from it, but the ones that need it,
will now have it.
There is a bug in .25a that prevents gutcheck from running. I have no
idea what the bug is or why it was getting an error. Version .25 worked
and this version works, and I haven't touched the code in that part of
the program lately. Rather than release a third version of .25, I'm
pushing up the release of .26 a bit.
Version .25a (200K) Fixed some fairly serious boundary condition
bugs in the block rewrapping parameter code that I hadn't taken into
account initially. Thanks to DaveKline for the bug reports and
examples of failure mode text.
Version .25 (200K) Fiddled with the Internal links guessing code a bit
more. When making internal links to named anchors, the window will
pop up a list of all of the named anchors in the text. If you are
hyperlinking an index, it can get pretty unwieldy. Added some code to
try to guess which link you will want and put the likely candidates
near the top of the list. If you name your anchors with this in mind,
it can work pretty well. It looks at the first word in the selected
text for the internal link and searches for named anchors that contain
that word. Works pretty well for indexes.
Worked some more on the Greek pop-up window. Added the capital vowels
with rough breathing marks. Added selection for what kind of mark up
the function produces; transliteration, letter names or HTML entities.
Added some new markup for the rewrapping function. If the rewrapping
function encounters /*..*/ markup, it skips over everything in the
block enclosed by the markup. Added /#..#/, similar to gutwrench and
rewrap-indent: anything enclosed in /#..#/ will be block indented using
the standard block indent margins. If you put margin numbers on
the opening line, it will use those numbers for the margins instead.
They must be formatted thusly: ( /#[x.y,z] ) The first
number is the general left margin override. ( /#[x] ) It will indent
all of the lines x spaces. If there is a period and a second number,
( /#[x.y] ), the first line will be indented y spaces and the
rest x. If there is a comma followed by a number, ( /#[,z] ), it will
override the default right margin setting. You can override the margins
in nearly any combination. If you override the first line (y) you will
need to have an x value, otherwise the y will be used for all of the
lines, and if you have both a left margin and right margin setting, the
left margin needs to come before the right. - /#[,z.yx] won't
work, at least not like you'd expect.
For example:
/#
Text text text
text text text
#/
will be indented and rewrapped using the standard block rewrap margins.
/#[6,53]
Text text text
text text text
#/
will block rewrap with a left margin of 6 and right margin of 53
instead.
/#[2]
Text text text
text text text
#/
will use a left margin of 2 and a standard block wrap right margin.
/#[4.6,70]
Text text text
text text text
#/
will have the first line margin at 6, the rest of the lines at 4, and
wrap after column 70.
And so on.
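The margin rules above can be sketched as a small parser. This is Python rather than the Perl guiguts is written in, the name parse_block_margins is made up, and the default margins are passed in by the caller because the real default values are not stated here:

```python
import re

# /#  optionally followed by [x], [x.y], [x,z], [x.y,z] or [,z]
MARKUP = re.compile(r"^/#(?:\[(\d+)?(?:\.(\d+))?(?:,(\d+))?\])?\s*$")


def parse_block_margins(line, default_left, default_right):
    """Return (first_line_indent, left_margin, right_margin), or None
    if the line is not an opening /# marker in the expected form."""
    m = MARKUP.match(line)
    if m is None:
        return None
    x, y, z = m.groups()
    if x is not None:
        left = int(x)
    elif y is not None:
        left = int(y)  # no x given: y ends up used for all of the lines
    else:
        left = default_left
    first = int(y) if y is not None else left
    right = int(z) if z is not None else default_right
    return first, left, right


if __name__ == "__main__":
    # 2 and 72 are placeholder defaults, not the real guiguts settings.
    for marker in ("/#", "/#[2]", "/#[6,53]", "/#[4.6,70]", "/#[,70.6]"):
        print(marker, parse_block_margins(marker, 2, 72))
```

A marker with the right margin before the left, such as /#[,70.6], simply fails to match in this sketch, which is one way of getting the "won't work" behavior.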
This markup is available in addition to the block rewrap function, not
in replacement of it. It is really just a way of doing overriding of
default rewrap margins inline while running a standard rewrap.
Made the file open dialog see .htm and .html files also by default.
Because I want it that way and what I thinks carries a lot of weight
with the author. :-)
Fixed the File->Save As and File->Include functions to default to
the directory the open file is in rather than the directory the
guiguts executable is in.
Added a function to the Fixup menu near the Rewrap function to
automatically clean up the rewrap markup /* */ & /# #/.
Anything on the line with the markup WILL BE DELETED. The entire line,
including the newline, will be removed, so leave a space before the open
and after the close markup. (The standard, anyway.)
Added a Footnote format tidier for non HTML versions. It will reformat
the footnotes to be a little more aesthetically pleasing and rewrap
them, however, it will destroy the footnote markup so that other
automated tools will no longer be able to work with them. If you plan
to make an HTML version, SAVE THE FILE WITH A DIFFERENT NAME BEFORE YOU
RUN THIS, because this function will make the automated footnote
hyperlinking tool ineffective.
Version .24 (195K) Added a Latin 1 pop up chart under the Help
menu. It contains the bulk of the characters that aren't directly
available on a std US layout keyboard. The accent marks aren't
included, neither are the nonbreaking space & hyphen and a few
other obscure characters.
Added a pop up Greek transliteration chart under the Help menu. It uses
a very similar scheme for transliteration as the site, based on the
encoding used by the Helen project. It uses Latin-1
encoding rather than ASCII. The only real difference is eta ( η )
is encoded as ê rather than ae and omega ( ω ) is
encoded as ô rather than o. That makes it easier to distinguish
eta from alpha epsilon and omega from omicron. The upsilon defaults to
y rather than u since it will only be a u if it is part of a diphthong.
(Ζευς is Zeus, not Zeys). I also put in buttons for the "rough
breathing" marks. The rough breathing marks were, in essence, the "h" in
Greek. They only occur over vowels (and rho) and signal that there
should be an h sound before a word. So ύδρα is hydra, not ydra or udra.
I considered implementing beta encoding but dismissed that after
investigating a bit. Beta encoding is very good for having something
that exactly records the original Greek, but is almost unreadable in the
transliterated form.
Trapped a warning in the rewrap routine and tried to find a reported
problem of falling into an infinite loop. Not terribly successfully,
I'm afraid.
Added button to word frequency window to re run the word frequency
routine. It will save the file first if it has been edited to get a
more accurate frequency count.
Added a "Find orphaned and nested brackets" function under the search
menu. It will automatically find all orphaned and nested brackets and
allow you to step through them to check their validity. Will work for
parentheses ( ), square brackets [ ], braces { } and angle
brackets < >. It can take a while to do the initial search,
especially on the < > search in HTML marked up texts, so be
patient.
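One way such a scan could work, sketched in Python for a single bracket type. The function name is hypothetical, and treating "nested" as "an opener seen while another of the same type is still open" is an assumption about what the function reports:

```python
def find_suspect_brackets(text, open_ch="[", close_ch="]"):
    """Return sorted (offset, kind) pairs for nested or orphaned brackets."""
    suspects = []
    open_stack = []  # offsets of openers that have not been closed yet
    for pos, ch in enumerate(text):
        if ch == open_ch:
            if open_stack:
                suspects.append((pos, "nested"))
            open_stack.append(pos)
        elif ch == close_ch:
            if open_stack:
                open_stack.pop()
            else:
                suspects.append((pos, "orphan"))  # closer with no opener
    # Anything still open at the end of the text never got closed.
    suspects.extend((pos, "orphan") for pos in open_stack)
    return sorted(suspects)


if __name__ == "__main__":
    print(find_suspect_brackets("a [b [c] d] e] f [g"))
```

Calling it once per bracket pair, e.g. with "(" and ")" or "<" and ">", would cover the four types listed above.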
Fixed regex extracted variables to be usable more than once in a
replacement term. They were artificially limited to once per
replacement term.
Made default markup for image tags include the align tag for flowing
text around the image. Defaults to align="left", change to right,
center or whatever suits.
Added option to Internal links pop up selection window to sort
alphabetically. Possibly handy while hyperlinking indexes. (I wanted
it, so you got it.)
When doing internal links, if the selected link text matches one of the
anchor names, that anchor will be displayed at the top of the list.
(Handy when hyperlinking indexes to cookbooks where the index is
basically an alphabetically sorted list of all recipes in the book.
(Guess what I've been post processing.)
Added a few more search terms to the regex.rc file.
Version .23 (143K) Fixed very nasty bug with spell checking
function where spell check would skip a word every time you added one
to a dictionary, either the project or Aspell one. Thanks to martinag
for finding this and bringing it to my attention!
Removed key bindings for HTML header markup, Alt-1 through Alt-6. They
were interfering with adding high ascii characters using the alt keys.
Added another subfunction to the word frequency
window, Check Accents. Similar to the Check Hyphens function, will sift
out all of the words in the text that contain accented letters and
display them along with their frequency count. If a word is found that
is the same as one of the accented words except it has no accents, it
will be displayed too, marked with ****. The unaccented word may show
up more than once in the list if more than one variation of accented
characters shows up.
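A minimal sketch of the same idea in Python. The names are made up, and stripping accents through Unicode decomposition is an assumption; the original may well use a lookup table instead:

```python
import unicodedata
from collections import Counter


def strip_accents(word):
    """Drop combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def check_accents(words):
    """List accented words with counts, each followed by its unaccented
    twin (marked with ****) when that twin also occurs in the text."""
    counts = Counter(words)
    report = []
    for word in sorted(counts):
        plain = strip_accents(word)
        if plain == word:
            continue  # no accents in this word
        report.append((word, counts[word]))
        if plain in counts:
            report.append((plain + " ****", counts[plain]))
    return report
```

Because the unaccented form is appended after every accented variant that maps to it, it can appear more than once, matching the behavior described above.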
Added a second harmonic display function to the word frequency list
but, after tinkering with it for a while, removed it again. It ran very
slowly for longer words, (about 5 minutes for a 10 letter word on a P4
2 GHz processor, during which time it was completely locked up,) and
returned almost uselessly long lists for short words. (Search on any 2
letter word and you got back EVERY one and two letter word in the text
plus a significant portion of the three letter words.) Ah well, I
didn't spend that long implementing it so I was able to discard it
without a twinge.
Did some more tweaking of the Auto HTML generation code, particularly
with respect to Footnotes. Trapped some more potential formatting
problems.
Version .22 (142K) Fixed flaky found term highlighting. Another
consequence of the cut 'n paste fix.
Fixed the undo EVERYTHING bug in the search and replace function.
Something I was trying out didn't work too well. (Actually it DID work
too well. Removed now.)
Got the Save As function to save the bin files with the correct name
and in the correct directory. Should work correctly all the time now,
worked sporadically before.
Worked on a problem with the guess page numbers function not being able
to open page images numbered below 100 or above 1000. Works much
better now and gives more indication of problems if it encounters
one.
Worked on the search window option selection code. The options would
get confused and eventually disabled if you had both the Search window
and the Word Frequency window open at the same time and were using
both. (Which is pretty much exactly what you want to do most of the
time.) After puzzling over the problem for some time, I figured out
that the GUI is pretty picky about how it will let you access variables
associated with GUI elements (check boxes). If I manipulate the
variables directly, the GUI disassociates the variable from the
element. It insists that I interact through the GUI hooks. Fine, but if
the GUI window hasn't been instantiated, the GUI hooks aren't
available and I NEED to manipulate the database directly. Sigh. What it
all boils down to is I added a whole bunch of if-then-else statements,
and the Search options are much more stable now.
Not really a bug or a fix, just a notice: If you have an extremely long
search or replace term (more than 40 characters), it will automatically
scroll the text over 39 characters and seem like the search term has
been chopped off. It hasn't. You can move back and forth with the arrow
keys. Yet another consequence
of the cut 'n paste fix (which I'm beginning to suspect was more
trouble than it was worth.)
Found and fixed some problems with case interpolations in regex
replacement text. Under certain conditions, the case interpolation was
being applied to the term after
the one specified in the replacement term.
Fixed the three Blank line searching functions to automatically start
searching again from the beginning if they reach the end of the file.
(They were supposed to before but they weren't.)
Worked on the spell checking program interface a bit. Got the
replacement term selection intelligence working. It will now learn from
experience and put words that are often chosen as the replacement for
a particular misspelling earlier in the replacement terms list
the next time that misspelling is encountered. I
have also enabled automatic entry of the top guess into the replacement
entry box. This will probably be wrong as often as it is right, but
even if it is only right 10 % of the time, it is 10 % better than having
nothing in the replacement box at all.
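The "learn from experience" behavior could look something like this in Python. The class and method names are invented for the illustration, and a plain count of past choices is an assumption about how the ranking works:

```python
from collections import Counter, defaultdict


class ReplacementMemory:
    """Remember which replacement was picked for each misspelling."""

    def __init__(self):
        self._chosen = defaultdict(Counter)

    def record(self, misspelling, replacement):
        self._chosen[misspelling][replacement] += 1

    def rank(self, misspelling, suggestions):
        history = self._chosen[misspelling]
        # sorted() is stable, so suggestions never chosen before keep
        # the order the spell checker gave them.
        return sorted(suggestions, key=lambda s: -history[s])


if __name__ == "__main__":
    memory = ReplacementMemory()
    memory.record("teh", "the")
    memory.record("teh", "the")
    memory.record("teh", "ten")
    ranked = memory.rank("teh", ["tea", "ten", "the", "tech"])
    print(ranked[0], ranked)  # ranked[0] would be the top guess
```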
Added another binding to the replacement words list box. Double click
moves the word up to the replacement term box. Triple click will
automatically replace the word in the text and advance to the next
misspelled word.
Spent some time trying to get the aspell interface to allow you to
change dictionaries from within guiguts. Finally got aspell to admit it
has other dictionaries,
(assuming it does), but it is steadfastly refusing to change to them.
I'll have to poke at it some more....
Added still another function to the word frequency window. Harmonics.
This is a list of all of the words that are in the current text that
are within one edit of the currently selected word. For instance, if
you select "the" and click Harmonics, you might end up with the list
"he, she, She, the, The, them, then, they, thy, tie". These are all the
words in the text that can be gotten from the selection with only one
edit. (Well, you get the original word back, so one or less... :roll: )
The edit can be a replaced letter, a removed letter or an added letter,
but there can be only one edit. You must have selected a word in
the word frequency window or it won't return a list. Different texts
will get different lists. It doesn't return every POSSIBLE
variation in spelling, only those that are present in the text. There
is a hot key shortcut - Ctrl-left click. The harmonics window has the
same search bindings as the word frequency window (left click to pop up
the search window, right click to search with the current search
settings). You can also recursively do a harmonic search on a word in
the harmonics window; you must use the hot key to do so. The Harmonics
function is fairly intensive to run. It does a large number of
comparisons for each word (259 * (# of letters in word) + 124), so
running the harmonics routine on EVERY word in the text would take an
unacceptably long time. In practice, running it on one word at a time is
pretty snappy and probably more useful anyway.
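For illustration, here is a compact Python check for "within one edit". It compares the two words directly rather than generating every variant, so it is not the approach that produces the comparison count quoted above; the names are hypothetical:

```python
def within_one_edit(a, b):
    """True if b equals a, or differs by one replaced, added or removed letter."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one replaced letter at position i
    return a[i:] == b[i + 1:]  # b has one extra letter at position i


def harmonics(word, text_words):
    """Words present in the text that are within one edit of `word`."""
    return sorted({w for w in text_words if within_one_edit(word, w)})


if __name__ == "__main__":
    sample = ["he", "she", "She", "the", "The", "them", "this", "there"]
    print(harmonics("the", sample))
```

The comparison is case sensitive, so "The" counts as one edit away from "the", the same as in the example list above.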
Version .21a (139K)
Fixed problem in Search window where search text would disappear if you
use the hot keys to search.
Version .21 (139K) Fixed, or at least worked on, a problem with the
spellchecking function where, under certain circumstances, it would
skip words that weren't correctly spelled, only to find them later
in the text. Not sure if it is completely fixed, but after the changes
I made, I was no longer able to reproduce the behavior.
Fixed spellcheck function to allow checking of a selection of text
instead of the whole file. This was nominally implemented already but
had bugs and would just spell check the whole file no matter what was
selected.
Made the search function under the word frequency spell checking
function do a regex search for -misspelling|misspelling- when the word
is found zero times in the dictionary. That happens because of the
different way aspell and guiguts treat hyphenated words. Aspell treats
hyphenated words as separate words, guiguts as a single word.
Made some changes to the search window entry boxes to try to compensate
for them not deleting selected text when you cut and paste. The changes
I made introduced their own set of problems that I think I have worked
around. Going to have to whack on the search window for a while to make
sure it behaves as expected.
Made the word frequency and gutcheck windows use the same display font
as selected for the main editing window. You may need to resize your
windows, or scroll a bit to see the results now, but I think the change
was worthwhile.
Added a character count subfunction to the word frequency window. Will
give counts of all non whitespace characters in the text.
Changed Save As function so it will save the markup .bin file with the
new name too. Be warned. If all you do is change the extension, the bin
files will collide and may cause problems. (Part of the reason I wasn't
too keen about implementing this.)
Modified the Save and Save As functions to only actually save anything
if a file has already been opened.
Version .20 (138K) Fixed a bunch of minor errors in the bookmarking
functions. Bookmark highlighting works correctly now. Does not jump to
bookmark on file load. After working with the bookmarks for a while, I
found the bookmark highlighting annoying. Put an option under
preferences to turn it off.
If you have a bookmark set in the middle of a paragraph and then rewrap
the paragraph, the bookmark will be moved to the end of the paragraph.
I could fix this but it would just add more overhead to the rewrap
function which has quite a bit already, and is not really all that
critical anyway, in my opinion.
Found a bug in the HTML named anchor function: it was not generating a
name for the named anchor if you had selected text. Traced it back to a
change I had made to compensate for something else. Oops. Fixed.
Changed how the footnotes are formatted in the auto generated HTML. They
look a little more aesthetically pleasing now, I think.
Fixed space in link names to footnotes.
Added sub function to autogenerate HTML function to convert all
characters from x80-xFF to named HTML entities, as well as
&,<,>, and ". The windows 1252 codepage characters are
converted as well. On large files or slow computers, may take a while
to run, be patient.
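A small Python sketch of this kind of conversion, using the standard library's entity table. The function name is made up. Note that it escapes every <, > and & it sees, so in this form it only makes sense on text that has not been marked up yet:

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace &, <, >, " and characters from 0x80 up with named entities
    where a name exists; anything without a name is left alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if (ch in '&<>"' or cp >= 0x80) and cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(to_named_entities('caf\u00e9 & "tea"'))
```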
Worked on autogenerate some more to prevent errant markup at boundaries.
Fixed a minor error with automated HTML superscript markup.
Modified behavior of some of the HTML markup insert functions. All of
the header markup <h1>-<h6> will remove any paragraph
markup from the selection when applied.
Finally got disgusted with battling the built in rewrapping function,
ripped it out and wrote my own. Tried to make it behave as much like
the original as possible without the idiosyncratic indenting. There may
be some subtle differences that I haven't compensated for, but it is
pretty close. In general, the new function seems to work pretty well.
It will try to prevent "widowed" (extremely short) last lines by
stealing a word from the previous line to pad it out if it is less
than 10 characters. I am not doing any line end smoothing; I suppose I
could mess around with it at some point to see what I can come up with.
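The widow handling can be sketched like this in Python. It is a simple greedy wrap, not a port of the actual routine; the function name and the default width of 72 are assumptions:

```python
def rewrap(paragraph, width=72, min_last=10):
    """Greedy word wrap that pads out a very short last line by
    stealing the final word of the line before it."""
    lines, current = [], ""
    for word in paragraph.split():
        if current and len(current) + 1 + len(word) > width:
            lines.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        lines.append(current)
    if len(lines) >= 2 and len(lines[-1]) < min_last:
        head, _, stolen = lines[-2].rpartition(" ")
        # Only steal if the previous line keeps at least one word
        # and the padded last line still fits.
        if head and len(stolen) + 1 + len(lines[-1]) <= width:
            lines[-2] = head
            lines[-1] = f"{stolen} {lines[-1]}"
    return lines


if __name__ == "__main__":
    for line in rewrap("aaaa bbbb cccc dddd ee", width=14):
        print(line)
```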
Added a menu option under fixup to insert underscores around the
selected text.
Came up with a function to try to guess page numbers based on average
page length for people working on files that no longer have the page
markers in them. It will ask for some page and line numbers to try to
calculate an average page length. Avoid selecting pages for the
calculation that are not in the body of the text (contents, index.)
They tend to have wildly different page lengths and will throw off the
calculation. This is only going to be somewhat accurate for texts that
have mostly the same number of lines on each page. The more the text
varies, the further the calculated pages will differ from actual.
DO NOT USE THIS FUNCTION UNLESS YOU HAVE NO OTHER OPTION. Or at
least, don't save the changes.
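The arithmetic behind such a guess is simple enough to sketch in Python. The function names and the shape of the input (pairs of page number and the line where that page starts) are assumptions for the example:

```python
def average_page_length(samples):
    """samples: (page_number, starting_line) pairs taken from the body text."""
    (p1, l1), (p2, l2) = min(samples), max(samples)
    if p2 == p1:
        raise ValueError("need samples from at least two different pages")
    return (l2 - l1) / (p2 - p1)


def guess_page(line, samples):
    """Estimate which page a given line falls on."""
    p1, l1 = min(samples)
    return p1 + int((line - l1) // average_page_length(samples))


if __name__ == "__main__":
    known = [(10, 300), (50, 1500)]  # page 10 starts at line 300, and so on
    print(average_page_length(known), guess_page(900, known))
```

With only an average to go on, every page whose real length differs from the average pushes the later guesses further off, which is why the warning above applies.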
Fixed a few spelling errors in the UI. If you didn't notice them, too
bad, I'm not telling you where they were.
Figured out and fixed problem where manual wasn't opening under
winguts. (I think)
Version .19 (131K) Added capability to add fonts to the display font
list. Will retain them once entered. If you want to delete them, open
the setting.rc file and remove the ones you no longer want. (You can add
them in the setting.rc file too if you want.) Arial, Courier New & Times
New Roman are the defaults and will be re added even if deleted from the
file. (Under most circumstances, at least. You can make them not appear
but, what's the point?)
Made a link to the HTML manual (here :-) ) under the help menu. Should
work under all versions of windows. Probably won't work under Linux,
but I have no system to test it on. Turned out to be much simpler than
I expected, although I was getting pretty far afield before I figured
it out.
Added a new menu item and a whole raft of hot keys for bookmarks. Mark
and jump back and forth between up to 5 spots in your text.
Control-Shift-(1-5) sets a bookmark and Control-(1-5) jumps to that
bookmark if it has already been set. Bookmarks can also be set/accessed
through the menu. They will be saved from session to session.
They can be reused, just set it in another spot to reuse it.
Wherever applicable, inserted shortcut hot key notation next to the menu
item in all menus.
Have enabled highlighting for zero width search results. Kind of
cheesy, I'm just highlighting the character after the zero width
assertion. In practice, it works pretty well. Search regex for ^$
(blank line) to see an example.
Worked on trying to come up with a work around for the lack of a
newline assertion for the regex search. After about 10 hours of work,
came up with something that would work about one third of the time.
Another third, it wouldn't find what you were looking for, and the
remaining times, it would just lock up the computer completely (once I
managed to spontaneously reboot my computer too). Gave up in
disgust and removed all the code again.
Added three more single function search items to the search menu: search
for two consecutive blank lines, search for three consecutive blank
lines & search for four consecutive blank lines. (To find single
blank lines, just use the regex search function with the search
assertion ^$.) This will probably cover about 50 % of what the \n
assertions would be useful for.
Added a new replacement text assertion: \T. Similar to \L & \U,
this will adjust case in the replacement text. Whereas \L will set to
lower case and \U to upper case, \T will set to title case. (First
Letter Of Each Word Capitalized). This is not a standard regex
assertion, but it allows you to do things that would not otherwise be
easily done. An example: search for (CHAPTER) and replace with \T$1\E
will yield "Chapter".
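Python's re module has no \T (or \L and \U) in replacement strings, but the same effect can be sketched with a replacement function. This is only an illustration of what \T$1\E does, not how guiguts implements it:

```python
import re


def title_case(match):
    """Stand-in for \\T$1\\E: title-case the first captured group."""
    return match.group(1).title()


if __name__ == "__main__":
    print(re.sub(r"(CHAPTER)", title_case, "CHAPTER I"))
```

One caveat with this stand-in: str.title() also capitalizes the letter after an apostrophe, so it is cruder than a hand-written title-casing routine.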
Uncovered a bug in the case assertions while I was implementing the \T
assertion. Would not let you use a case assertion in the first position
of the replacement string. It would just delete the text and not
replace it with anything. Fixed for all case assertions.
Version .18 (131K) Fixed a bug in the footnote moving function where, if
there were no footnotes for the last landing zone, the script would
silently lock up and not move anything.
Added \n variable interpolation to regex replacement term. If there is a
\n in the replacement text, it will insert a new line at that point.
Fixed a couple of instances where the search window could get set to do
both regex and whole word search at the same time, leading to
unpredictable searches.
Added a function under the fixup menu to set the invisible page markers
before page separators have been deleted. The page join function will
still set markers if they haven't already been set.
Have figured out a way to do non blocking calling of external programs
in the compiled executable version. Had figured out a workaround many
months ago for the script version, (actually for guiprep,) but was not
able to get it to work with compiled version. (I'll have to backport
this into winprep too.)
Since I can now execute external programs, I have put in hooks to an
external image viewer. If you have set page markers, (or removed page
separators, which amounts to the same thing,) the page number will
appear in the bottom status bar, along with a button that will open an
image viewer to the image file corresponding to the current page. It
defaults to looking in a "pngs" directory one level below the directory
the project file is in, however you can change the directory it looks
in for the png files. The image file directory is not sticky from
session to session. It will ask each time you restart the program. It
will, however retain the directory for a session, once it has been set.
You can change the paths to both the image viewer and the pngs
directory under the prefs->set file paths menu. When you set the
images path through the menu, it will attempt to open the first image
file in that directory.
Version .17 (127K)
Changed some of the menu items around as suggested in the forums. Moved
the search and highlighting item to under the search menu. Made all of
the menus tear off. I don't think it is particularly useful for some of
them, but hey, now you have the option if you want to. :-) It only took
about 25 seconds worth of coding so I was amenable.
Worked on the footnote parsing function to make it try to recover from
minor formatting errors a little more gracefully. Will search for and
correct the most common misspellings of Footnote I've come across:
Fotonote, Footnoto and footnote (lowercase F). It will also assume that
if it can't find a colon within twenty characters of the end of the
word Footnote, that the colon has been omitted and will place one at the
end of Footnote. This will allow for up to 19 digit footnote numbers,
should be enough for most books. :-) The missing colons were not so much
causing problems with the footnote moving routine as with the automated
generation of footnote links during HTML autogenerate, where it would
just silently and mysteriously fail. It relies on the colon to help
parse the Footnote number (letter).
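A hedged Python sketch of the recovery rules just described. It assumes footnotes open with "[Footnote" and only looks at the first one on a line; the function name is made up:

```python
import re

MISSPELLED = re.compile(r"\[(Fotonote|Footnoto|footnote)\b")


def repair_footnote_tag(line):
    """Fix the common misspellings, then add a colon right after
    'Footnote' if none turns up in the next twenty characters."""
    line = MISSPELLED.sub("[Footnote", line)
    m = re.search(r"\[Footnote", line)
    if m and ":" not in line[m.end():m.end() + 20]:
        line = line[:m.end()] + ":" + line[m.end():]
    return line


if __name__ == "__main__":
    print(repair_footnote_tag("[Fotonote 12: Some text."))
    print(repair_footnote_tag("[footnote A text without any colon at all here"))
```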
Fixed some rather bad bugs with landing zone handling when using more
than one landing zone in a text. Changed a bunch of things and added
error checking to make the whole process a lot less fragile. It was
very easy to make it not work if you didn't do things in a very
specific order. Made it a lot more forgiving of "out-of-sequence"
operation.
Reworked the layout of the footnote moving tool window a bit. It was
pretty sloppy and not very easy to figure out what some of the buttons
did.
Added a couple more buttons to the footnote window to automatically
insert landing zones at the end of each chapter or at the end of the
text. The auto insert functions will remove any existing landing zones
before adding new ones. (So if you want to remove all of the LZs, click
on Auto End LZ which will remove all but the one at the end of the
text, then remove that one manually.) The chapter end auto insert LZ
function is rather simplistic. It looks for 4 blank lines in a row and
assumes that it is a chapter break. (The standard layout for chapter
breaks, so not a big stretch.) It will skip the first 200 lines of the
text to avoid
putting footnote LZs in the title page or contents page. If you have
an especially long contents or preface, you may end up with some
unnecessary LZs. Don't despair, it will automatically remove any
landing zones that haven't been used after it is done moving the
footnotes.
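The chapter break guess can be sketched in a few lines of Python. The function name and the choice to report the first blank line of each run are assumptions:

```python
def chapter_break_lines(lines, skip=200, blanks_needed=4):
    """Return the index of the first blank line of every run of at least
    `blanks_needed` blank lines that completes at or after line `skip`."""
    breaks = []
    run = 0
    for index, line in enumerate(lines):
        if line.strip() == "":
            run += 1
            # Record each run once, at the moment it reaches the threshold.
            if run == blanks_needed and index >= skip:
                breaks.append(index - blanks_needed + 1)
        else:
            run = 0
    return breaks
```

A landing zone would then be inserted at each returned position, with unused ones cleaned up afterwards as described above.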
When the footnotes are moved, the script will attempt to move the
anchors against the text they are referring to. The anchors are often
spaced or have a line break between them and the text they refer to.
This just automates the fixup so you don't have to go back and do as much
manual tweaking.
Putzed around with the HTML generation code some more. Tweaked the
footnote layout a bit. Using blockquote tags to set them in a bit,
probably would be better done with CSS instead but it can be changed
when necessary. Worked quite a bit on the auto generation function. Got
it reacting fairly predictably. Fixed a bunch of border effect errors.
Shouldn't auto generate orphan markup anymore.
Added more markup to the detect orphans function. Will now check nearly
all standard HTML markup instead of just i and b. Of course, it's a lot
slower now... :-(
Made the auto generate function automatically handle subscript and
superscript markup. _{xx} will be changed to <sub>xx</sub>
and ^{xx} to <sup>xx</sup>. This is the only destructive
change that the auto generate function makes. Everything else can be
backed out of by selecting the whole document and hitting "Remove
markup from selection". (Will leave italic and bold markup.) It will
take a while to chug through the file....
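The subscript and superscript conversion amounts to two substitutions. A Python sketch, with a made-up function name:

```python
import re


def convert_sub_sup(text):
    """Turn _{xx} into <sub>xx</sub> and ^{xx} into <sup>xx</sup>."""
    text = re.sub(r"_\{([^}]*)\}", r"<sub>\1</sub>", text)
    return re.sub(r"\^\{([^}]*)\}", r"<sup>\1</sup>", text)


if __name__ == "__main__":
    print(convert_sub_sup("H_{2}O and x^{2}"))
```

Since the braces are consumed, there is no way to tell afterwards which tags came from this markup, which is what makes the change destructive.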
Added a button to the HTML popup named "Poetry". It will
automatically add non breaking spaces to preserve indenting and
insert line breaks after each
line of the selected text with one button press.
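Roughly what a poetry button like this has to do, sketched in Python. The exact tags are an assumption (the text above only says non breaking spaces and line breaks), and the function name is made up:

```python
def poetry_markup(text):
    """Swap leading spaces for &nbsp; and end every line with a break."""
    out = []
    for line in text.splitlines():
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        out.append("&nbsp;" * indent + stripped + "<br />")
    return "\n".join(out)


if __name__ == "__main__":
    print(poetry_markup("Roses are red,\n  Violets are blue,"))
```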
Got annoyed with some of the idiosyncrasies of the rewrap function, so
I whacked on that for a while. Think I've got it to a point where it is
not going to go charging off in odd directions too often anymore.
Version .16 (127K) Added a function to the word frequency routine to
sift out and display all Mixed Case words. These will primarily be
initial caps words, but it will also find words with caps in the middle
of the word. (It will not display words that are ALL caps.)
Added another function to the case adjustment functions: automatically
convert selected text to title case. (This was actually already active
in version .15, but I forgot to mention it.)
Added some markup shortcut keys for use when
generating HTML versions. Hot keys Alt-1 through Alt-6 will insert
markup <h1>..</h1> through <h6>..</h6>
respectively around the selected text.
Added a bunch of unlikely character combination checks to the regex.rc
list.
Added quite a bit of functionality to the HTML function under the fixup
menu. Made a button bar which has most of the popular HTML markup on
it; at least, all that is easily translatable to TEIlite. (Figured,
"why make life harder for myself later?" :-) ) Will automatically
insert the selected markup around the selected text. Some markup
buttons act differently depending on what text is selected. There is an
Autogenerate HTML that will do basic conversion to an HTML version. I
still haven't thought of a good way to parse the title page and
contents and automatically mark them up so the generated file will
still need some tuning. I am quite pleased at the automated generation
of links to out-of-line footnotes though. :-) I already had all of the
code in place to parse the footnotes, so it wasn't all that difficult
to implement. I am including a file called "header.txt" that has the
basic HTML header information it uses to make the header to the HTML
file. It is very basic, pretty much the absolute minimum to be valid
HTML. You can edit it however you like, if you want some custom
features. I am not using cascading style sheets at this point, though I
am leaning in that direction for the future. I'd like to get some other
opinions and suggestions before I go there.
This is not a very high end HTML editor. For simple texts it is
probably sufficient, and it will generate something that can be further
tuned in a more powerful editor if necessary. It will automate a bunch
of stuff that would be very tedious in a standard HTML editor though.
Version .15 (121K) Added a function to the word frequency routine
to find all capitalized words. Unlike PRTK and Gutwrench, I include
single character words because it will only grow my list by a maximum
of 26 words due to the way they are presented. This is a function that
I don't find particularly useful, but other people seem to, so I added
it. (Besides, it only took about 5 minutes to do it. :-) ) BTW, if you
sort case insensitively and then search for ALL CAPS, you won't find
any. This shouldn't be too surprising if you think about it.
Found an error in how warnings were set up (actually, it was pointed
out to me) that was preventing warnings from being raised. Combed
through the code fixing loads of minor errors (warnings) that were not
fatal but could lead to obscure bugs occurring while running. Fixed a
bunch of stunningly bad code that I was getting away with 'cause it
sorta worked and no one had called me on it.
Added some options under the Prefs menu item where you can set paths to
the various support programs. (gutcheck, aspell.) These were accessible
in other places but this collects them into one place where they might
be expected to be. Also moved rewrap margin setup to under Prefs.
Added a few variable interpolations for regex replace. Have enabled the
\L, \U & \E assertions for replacement text. Text surrounded
by \L and \E will be lowercased in the replacement text. Text
surrounded by \U and \E will be upper cased. If the \E assertion
is omitted, all of the text after the \L or \U assertion will be lower
or upper cased respectively. Usually will be used to change the case of
extracted variables. An example: Search for <(\/?)(\p{IsUpper}+)> and
replace with <$1\L$2\E>
will change any upper case HTML markup to lower case. Any instance of
<I>, </I>, <B> or </B> will be converted to
lower case. (Useful for XHTML. HTML is somewhat blasé about the
case of its markup, XHTML is much more finicky)
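For comparison, the same tag lowercasing written in Python. Python's re module supports neither \p{IsUpper} nor \L in replacements, so this sketch substitutes [A-Z] and a replacement function; it illustrates the effect, not the guiguts implementation:

```python
import re


def lowercase_tags(html):
    """Lower-case simple all-caps tags such as <I>, </I>, <B> and </B>."""
    return re.sub(
        r"<(/?)([A-Z]+)>",
        lambda m: f"<{m.group(1)}{m.group(2).lower()}>",
        html,
    )


if __name__ == "__main__":
    print(lowercase_tags("<I>word</I> and <B>bold</B>"))
```

Like the search term above, it only touches tags without attributes.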
Worked with donovan to get guiguts to run correctly under Linux. He
also was invaluable in helping track down some of the more obscure
warnings. Thanks donovan! Linux compatibility is about 80-90 % there.
Still need to work out some odd things here and there.
Version .14 (119K) Fixed rather serious problem where aspell
personal word list would be corrupted when guiguts exited. Only seemed
to be an issue with Win 98; Windows NT and 2000 (and presumably XP)
didn't seem to have the problem, or at least, not as bad.
Tweaked page separator routine a little. Auto join was failing if last
character of line prior to page separator was upper case. It was
pointed out that that was unnecessary. Changed. Modified logic to allow
the line after a page
separator to start with "I " (capital I space) without faulting and
needing user intervention. Normally, it fails to autojoin if the line
after a page separator starts with a capital letter, since it is not
uncommon that page ending punctuation is missed. "I" is so common and
occurs in texts so frequently, that it is probably one of the biggest
false negatives. Added it as a special case.
Fixed routine history to only record page separator routine when first
invoked, instead of each time it removed a separator. Arrgh...
Added a few more regex expressions to the regex.rc file.
Tweaked a few other minor user interface bugs.
Version .13 (119K) Fixed a problem with the Footnote Fixup routine.
It was not dealing with double open or closing brackets very well. They
are rare, but they do happen.
Made up a new scannos list containing some useful regex search and
replace terms, called regex.rc. It works like the other scannos list
except that you'll need to have Regex checked in the search window when
you use them. There aren't many in there, just a few I thought of off
the top of my head. If anybody comes up with any other useful regex
search expressions they think should be in there, let me know and I'll
add them to the distribution.
Made some change to the Footnote re indexing routine. Makes more of an
attempt to preserve the original style anchor markers (letters, Roman
numerals) instead of changing them all to numbers.
Tweaked various word frequency display lists (Frequency sort.
Alphabetic sort, Hyphen sort, Alpha/Numeric sort, Spellcheck sort) to
be more uniform in how they handle searches. Most of the changes are
behind the scenes and not visible to the ordinary user, but I know they are there.
Added a Sidenote fixup function. Will find all Sidenotes marked with
"[Sidenote" and move them to just before the paragraph they are in. It
will leave 1 blank line before and after each Sidenote. This will allow
the text to be re wrapped without folding the sidenotes back into the
paragraph. (And besides, I like it better that way.) It will also do
some basic error checking and alert you if it finds something it thinks
is a sidenote but has bad markup. This hasn't been a high demand
function, but I've got a text to post process with about 18 bazillion
(loosely defined as 283) sidenotes in it and I didn't want to deal with
them manually. So there.
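A minimal sketch of the Sidenote move described above, assuming paragraphs are separated by blank lines and each sidenote is closed by a single "]". The name `move_sidenotes` is invented for the example, and the bad-markup checks are not modelled.

```python
import re

# Assumption: a sidenote runs from "[Sidenote" to the next "]".
SIDENOTE = re.compile(r"\[Sidenote[^\]]*\]\s*")

def move_sidenotes(text):
    """Lift each sidenote out of its paragraph and place it just before
    that paragraph, with one blank line before and after it."""
    blocks = []
    for para in re.split(r"\n\s*\n", text.strip("\n")):
        notes = [m.group(0).strip() for m in SIDENOTE.finditer(para)]
        body = SIDENOTE.sub("", para).strip()
        blocks.extend(notes)          # each sidenote becomes its own block
        if body:
            blocks.append(body)
    return "\n\n".join(blocks) + "\n"
```

Because every sidenote ends up as its own blank-line-delimited block, a later rewrap of the paragraph cannot fold it back in.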
Fixed minor problem where stealth scannos directory wouldn't be
remembered if stealth scannos function was run more than once per
session.
Version .12 (117K) Added some more functionality to the Footnote
fixup function. Separated first pass and re index buttons. Added
buttons to allow you to switch view from the footnote to the anchor
with one button press. (Useful for footnotes after they have been moved
out-of-line.) Added option to do unlimited search for anchors. Was
artificially limited to searching only the previous page (more or less)
to prevent getting 50 footnotes all pointing to the anchor point [1].
After the footnotes have been reindexed and moved, however, that limit
was preventing the script from finding anchors. The unlimited search
should only be used on footnotes that you are fairly confident don't
have any duplicate anchor markers.
Found and fixed bug where script was sometimes skipping adjacent
footnotes. (less than 2 characters between the end of one and the start
of another.)
Added spell checking functionality to the word frequency function. Will
filter the list to only show words that the spell checker doesn't
recognize. You may end up with words in the list with a frequency of
zero due to the different definitions of what a word is by the word
frequency routine and the spell checking program; typically these are
parts of hyphenated words.
Finally got regex variable extraction for replacement working. Still
not perfect, but not bad. Will support up to 8 replacement variable
extractions. Use standard regex syntax: surround the match variable
with parentheses and use numbered back references in the replacement.
E.g.: for the string " [12] " you could match \[(\d+)\] and replace with
[Footnote $1: ] to end up with the string " [Footnote 12: ] ". I am not
doing a true regex replace so you don't need to escape meta characters
in the replacement text. The search is a true regex search so meta
characters must be escaped. Short list of meta characters -
"{}[]()^$.|*+?\-". Any of these characters needs a backslash before it
if you want to search for the literal character.
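To make the "not a true regex replace" point concrete, here is a rough sketch of the idea in Python: the search is a real regex, while the replacement is treated as plain text in which only $1 through $8 are substituted. The function names are hypothetical and this is not the script's own code.

```python
import re

def expand_replacement(template, match):
    """Substitute $1..$8 in a plain-text template with captured groups."""
    def group_text(m):
        index = int(m.group(1))
        # Groups that do not exist or did not match expand to nothing.
        if index <= match.re.groups and match.group(index) is not None:
            return match.group(index)
        return ""
    return re.sub(r"\$([1-8])", group_text, template)

def search_replace(text, pattern, template):
    """Regex search, literal replacement text with $N variables."""
    # A callable replacement is inserted as-is, so meta characters in the
    # template need no escaping.
    return re.sub(pattern, lambda m: expand_replacement(template, m), text)
```

For example, `search_replace(" [12] ", r"\[(\d+)\]", "[Footnote $1: ]")` gives `" [Footnote 12: ] "`, and the square brackets in the replacement are taken literally.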
Regexes are extremely powerful and useful to do variable search and
replace operations. There have been whole books written on regexes so
I'm not going to try to cover them in great detail. Some of the more
complex ones look more like line noise than a search function. A decent
basic tutorial is in the perl documentation. See
http://www.perldoc.com/perl5.8.0/pod/perlrequick.html.
There are a bunch of regex search and replace expressions
that will be useful over and over. (Only the expression within the
double quote marks.)
search - "(\S)\s\s(\S)" replace - "$1 $2" -- Find exactly two
spaces between any non space characters and remove one space. Will
ignore indenting and Multi space strings.
search - "\.(\s\p{IsLower})" replace - ",$1" -- Find period
followed by a space and a lowercase letter, replace with comma. Will
get lots of false positives.
search - ",(\s\p{IsUpper})" replace - ".$1" -- Find comma
followed by a space then an upper case letter, replace with a period.
Will get lots of false positives.
search - "(?<=[^\-])-{4,}" replace - "----" -- Find a string of
hyphens at least four in a row, preceded by something that is not a
hyphen and replace with a string of four hyphens.
--and many, many more. That last one is not specifically useful, it is
more just an example of what kind of neat tricks you can do.
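For anyone who wants to try two of these expressions outside the program, here is a small Python rendering. Assumptions: Python's re module writes back references as \1 rather than $1, and the text is processed one line at a time so that \s never swallows a line break. The two \p{IsLower} / \p{IsUpper} expressions are left out because Python's re module has no \p{...} classes. Function names are made up.

```python
import re

def collapse_double_spaces(text):
    """Exactly two spaces between non-space characters become one."""
    return "\n".join(
        re.sub(r"(\S)\s\s(\S)", r"\1 \2", line) for line in text.split("\n")
    )

def trim_hyphen_runs(text):
    """Runs of four or more hyphens after a non-hyphen become four."""
    return "\n".join(
        re.sub(r"(?<=[^\-])-{4,}", "----", line) for line in text.split("\n")
    )
```

Indented lines and runs of three or more spaces are left alone by the first function, as noted in the list above.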
There is now a Function History pop up window available under the Help
menu. It tracks the major functions as they are performed and keeps a
record of them. It will be saved session to session as long as the file
is saved. In other words, it will only retain records of functions
performed that have had the file saved afterwards.
Version .11- (116K) Added another button to gutcheck window.
Allows you to easily rerun gutcheck without switching back to the main
window.
Changed the function of the Del button in the page separator routine.
Now completely removes line instead of just clearing it.
Footnote moving tool has been activated. It's complex, ugly, limited,
buggy and much less automated than I originally planned, but it's a
start. Right now, it works best with simple footnotes. Nested footnotes
(the bane of my existence) are not supported very well. (You can do it,
you just need to be very careful.) I plan to do more with this, but
I've already spent over a week just trying to get this one function
working and I'm sick of pounding my head on it. Though the concept is
simple, there is an amazing amount of fancy dancing that needs to go on
behind the scenes to keep everything straight.
Version .10 - (110K)
Modified gutcheck view options to only allow one instance to be
created. Made closing gutcheck window destroy the options window as
well.
Fixed bad subroutine call in "Change All" routine under
spell check function. Caused it to quietly fail.
Framework for footnote moving and checking tool is implemented. It
doesn't really do anything useful yet, but it is fun to play with. :-)
Modified executable version to get around problem with aspell blocking
and failing. It will now open a console window. (DOS box) You shouldn't
have to do anything with it, although you may need to close it
separately after you close winguts. It is necessary to have it open for
the program to communicate with aspell.
Version .09 - (109K) Got interface to Aspell/Ispell functioning. Either
can be used, though aspell seems to be slightly better supported under
windows and seems to be the more capable spelling package. Now able to
spell check, with replacement suggestions, either a selection or the
whole file if no selection is made. Have enabled function to allow you
to add words to the Aspell dictionary. Still have to puzzle out how to
allow you to select dictionaries in aspell. Added "Change All" button
in Spell check box to allow you to change all occurrences of a
misspelled word with one button press.
Added a "project dictionary" function. Allows you to skip project
specific words. Use it for words that are common in a project, but you
don't want to add to your standard dictionary. (Dialect, proper names,
etc.) Project dictionary is saved in the directory the text file is in
with the same name, but a ".dic" extension.
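The naming rule for the project dictionary is simple enough to show in a couple of lines. This is just a sketch of the rule as stated; `project_dictionary_path` is a made-up name.

```python
from pathlib import Path

def project_dictionary_path(text_file):
    """Same directory and base name as the text file, ".dic" extension."""
    return Path(text_file).with_suffix(".dic")
```

So a text file at books/novel.txt would have its project dictionary at books/novel.dic.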
Tied Word Frequency information into spell checking function. If you
have run Word Frequency before you do spell check, it will display how
many times that particular word occurs in the text to help decide
whether it is a misspelling or not. **Note: the case of the word must
match in both the spell check and word frequency list. You probably
won't get useful results if you do a case insensitive sort in word
frequency.
Added alphanumeric check to word frequency window. Allows you to easily
check all of the words with mixed digits and non-digits. It will
include numbers that have commas and/or hyphens. I debated filtering
them out, but during testing, I found several problems with dates in my
test project that way, so I elected to leave them in. Makes it very
easy to find "H0ME" and "a11" errors as well as "l87I" and "l9OO",
which are sometimes missed by a standard spell checker.
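The alphanumeric check boils down to "contains at least one digit and at least one non-digit". A small sketch under that assumption (the real routine may define words differently, and the function names here are invented):

```python
def is_mixed_alphanumeric(word):
    """True if the word mixes digits with anything that is not a digit."""
    has_digit = any(ch.isdigit() for ch in word)
    has_other = any(not ch.isdigit() for ch in word)
    return has_digit and has_other

def alphanumeric_words(words):
    """Return the distinct words that mix digits and non-digits."""
    return {w for w in words if is_mixed_alphanumeric(w)}
```

Numbers with commas or hyphens, such as "1,234", count as mixed here and so stay in the list, matching the decision described above.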
Made some of the gutcheck error searching routines a little more
robust. Should set the cursor exactly at the error more often. Still
gets thrown off by HTML markup, especially markup adjacent to the error.
Trapped another potential problem with rewrapping mangling the hidden
page markers. If you interrupted rewrapping, it was losing most of the
page markers and strewing cedillas (don't ask) throughout the text. :-(
Fixed now.
Worked on regex search highlighting, still not perfect--but much better.
Fixed minor bug in Replace All function of search box. Was not
replacing first occurrence when Replace All was selected.
Added selectable view options to gutcheck window. Only see the errors
you want. If you right click on an error in the list it will be deleted
from the list and will not be recoverable unless you run gutcheck again.
Version .08 - (105K)
Fixed bug where bin file was not being saved on program exit.
Added ability to change Font, size and weight in main editing window.
Very limited set of fonts available right now. I could probably add
more if it is desired. The font information is for viewing only. There are
no formatting changes made to the text files.
Have spell check partially functional. If you have aspell or ispell
installed on your system, it will search and find all of the
"misspelled" words in the file and let you cycle through the file and
check them. The replacement choices option is being resistant to
implementation. I decided to release this as an interim since I figured
half a loaf is better than none. You also can't add words to the
dictionary yet. It will get there.
Fixed problem with rewrapping mangling the page index markers. It is
not perfect but will generally remember the correct page to within one
word of the original page breaks, especially for words that were broken
across pages.
Version .07 - (102K)
Fixed bug in Search function dialog box, was not updating "number of
times term found" label until a search was performed.
Fixed fairly serious bug in rewrap / block rewrap function which would
cause it to randomly fail if the selection did not start and stop
exactly at a blank line.
(Notice: rewrap functions still work best if the selection starts and
stops exactly on a blank line. If you insist on rewrapping a
selection that terminates in the middle of a line of text, the results
may not be exactly what you expect or want).
Fixed a subtle and obscure bug with page separator function. If there
were two page separators in a row (from an illustration or blank page)
with a word hyphenated across both separators, it would yield odd (read
wrong) results.
Tweaked separator search so that first few separators will scroll to
the center of the screen. It was not centering the separators until you
were further into the text than twice the height of your text window.
There is a very minor bug when converting the very first page
separator. It will always add one more blank line than you request. I
probably won't fix this since it is so minor, it is just something I've
noticed.
Added a "Delete" button to the page separator dialog. Deletes page
separator without making any other edits. Added a pop up help screen
too to explain the different page separator editing functions briefly
and listing the hot keys.
Fixed undo buffer to reset on file save. It was not, and I left it that
way for several versions but if you start doing major editing (rewrap),
the undo buffer starts chewing up large amounts of memory and slowing
down your computer. It will still do that if you don't save
periodically, but that is probably A Bad Idea™.
Added some hot keys to the Search function. Enter will search.
Shift-Enter will replace. Ctrl-Enter will replace and search.
Ctrl-Shift-Enter will replace all.
Added -v verbose switch to gutcheck options. Was enabled by default,
now you have a choice.
Got the full automatic header replacement working rather well, after
much dithering and fruitless experimentation. While full automatic
header replacement is being done, the Undo button reverts to doing
single step undo. It was very problematic trying to predict what
exactly you would want undone in that situation, so if you want to undo
something while in full automatic, you'll just need to single step back
through the undo buffer. I am debating adding some more automatic fixes
but want to get some feedback first.
Cobbled up a method to track page numbers after page separators are
removed. An ugly, nasty kludge... but it seems to work fairly well.
Page numbers will now show up in the bottom status bar if available.
Need to come up with some way to save the now invisible page markers
from session to session.
Debated various things and finally settled on writing a hash of the
page markers and indices to a separate file rather than trying to keep
them in the same file. The cons: need to have two separate files, one
with the text info and one with the markup data. The pros: able
to use the existing tools without modification to skip the binary data.
Separate files win. This actually may be a good thing as I can start to
keep indices to HTML markup in the separate file and only instantiate
it when you want to generate an HTML version. Anyway... the page marker
information will be written to a file in the same directory as the text
file with the same name but a ".bin" extension. It is actually just a
text file, and you can open it and look in there if you like, but be
VERY cautious about editing it. It is pretty sensitive to correct
formatting. As long as the ".bin" file is found in the same directory
as the text file, it will load when the text file loads.
This may be something we want to start saving with the archive files
since it can be used to reconstruct the text layout long after it has
been post processed.
Version .06 - (99K)
Found and fixed a few bugs in the fixup routine. Under certain
circumstances, it would leave whitespace at the end of a line. Fixed.
It was having problems with strings of hyphens longer than three in a
row. Fixed.
In the rewrap routine under certain conditions, it would either trample
the text after the selection or not rewrap the entire selection.
(Usually only extremely long paragraphs or extremely long lines that
needed lots of rewrapping) Fixed.
Finally tracked down a fix (or at least a work around) for the
scrollbar not resizing in proportion to the text list size in the
various popup windows. Not real elegant, but it works.
When using the search function, it will no longer display the number of
times a word is found in the document if "Whole Word Only" is
unchecked. This led to some confusing disparities between how many
times a term was actually found versus how many times it was reported.
Added another failure detect mode to the HTML orphan markup detection.
Totally overhauled and changed the Page Separator Removal function. Now
is interactive much like PRTK. It was too problematic trying to make it
fully automatic. Now, a small window with several option buttons will
pop up. It will automatically search for and highlight the next page
separator and wait for you to make a decision on how to handle
it. There is also an option to save a marker in the text with the page
number as an HTML comment. Unfortunately, this will drive Gutcheck
batty as it will generate 3-4 warnings for EACH.... :-( I'll have
to figure out some tap dance I can do behind the scenes to make this
more usable. As of now it is available but not recommended.....
Added a regex search option to the search popup. It is subtly broken in
three or four ways but it is as good as it is going to get.
1) you can't match a newline (\n) character.
2) it won't perform matches across line boundaries. (Actually 1 & 2
are the same problem stated two ways)
3) accented characters will not match \w or \b assertions.
(Sorry, but that's the way it is. These are all flaws inherent in the
Tk text widgets and there's nothing I can do about it)
4) found term highlighting is broken for regex searches, especially if
you start using a bunch of variable or zero width assertions. It will
highlight something, but it may be more or less than the actual
matching text. I'm tempted to just turn off highlighting for regex
matching, we'll see.
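One way to picture limitations 1 to 3 is a search that looks at one line at a time with an ASCII-only idea of a "word" character. The sketch below only imitates that behaviour in Python; it is an analogy under those two assumptions, not a description of how the Tk text widget works internally, and `search_lines` is a made-up name.

```python
import re

def search_lines(lines, pattern):
    """Search each line separately; return (line number, start, end) hits."""
    rx = re.compile(pattern, re.ASCII)   # \w and \b only know ASCII letters
    hits = []
    for number, line in enumerate(lines, start=1):
        for m in rx.finditer(line):
            hits.append((number, m.start(), m.end()))
    return hits
```

Searched this way, a pattern containing \n can never match because no single line contains a newline, and \w+ stops short at the accented letter in a word like "café".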
Broke out "Remove end-of-line white space" from fixup function to let
it be run separately if desired. (It is still in fixup too, this just
lets it run separately)
Updated Manual.
Version .05 - (97K) Fix one thing, break something else. :-\
When I did away with my previous file name parsing, it broke the check
to see if the file was edited before running Gutcheck. Fixed.
Added another button to scannos search dialog to swap the search and
replace terms so you can easily do reciprocal searching. Made searching
better able to deal with accented characters. Still some
bizarreness though. For some reason I am not able to make perl detect a
border of a word if it begins or ends with an accented character. Makes
it impossible to do a "Whole Word Only" search for those. I've done a
kind of half assed work around by detecting if a word starts or ends
with an accented character, then doing a pattern search instead of a
whole word search in those cases. I am talking with the maintainers of
the perl Tk text widgets to see if I can get this working correctly.
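The work around described above can be sketched like this: use word-boundary anchors only when both ends of the search term are plain (unaccented) word characters, and otherwise fall back to an ordinary pattern search. This is an illustration in Python with invented names, not the script's code.

```python
import re

def is_plain_word_char(ch):
    """ASCII letter, digit or underscore."""
    return ch.isascii() and (ch.isalnum() or ch == "_")

def whole_word_pattern(term):
    """Build a 'Whole Word Only' pattern, degrading to a plain pattern
    search when the term starts or ends with an accented character."""
    escaped = re.escape(term)
    if term and is_plain_word_char(term[0]) and is_plain_word_char(term[-1]):
        return re.compile(r"\b" + escaped + r"\b", re.ASCII)
    return re.compile(escaped)
```

The price of the fallback is that a term such as "été" will also be found inside longer words, which is why it is only a partial fix.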
Made a few minor bug fixes to script; update line/column indicators
when a gutcheck error is selected, added <ctrl>-s to the hot key
bindings. <ctrl>-s will save the file. Fixed a few errors in
documentation. I've added the extension ".ggp" (guiguts project)
to the default search list in the file open dialog. If you are running
the windows executable version, you can register the ".ggp" extension
to be opened by the winguts.exe program. That way you will be able to
open winguts by double clicking on a file with a ".ggp" extension. Made
search term entry take focus when Search & Replace box opens.
(Thanks to martinag for the suggestions!) Trimmed some unnecessary
files from the executable build, drastically reduced file size.
Version .04 - (96K) Added some more function to the search /
scannos functions. If word frequency has been run before the
search function is called, it will display how many times a particular
search term appears in the text. Note that this is only accurate for
whole words. Searching on a part word or punctuation will be labeled as
"not found"; not because it doesn't exist in the text, but because it
doesn't exist in the word list.
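The reason partial words show as "not found" is easier to see in code: the count is a straight lookup in the word list, not a search of the text. A minimal sketch, assuming a simple \w+ definition of a word (the real routine's definition differs) and made-up function names and label wording:

```python
import re
from collections import Counter

def word_counts(text):
    """Build a word frequency table (assumed word definition: \\w+)."""
    return Counter(re.findall(r"\w+", text))

def occurrence_label(term, counts):
    """Look the search term up in the word list, not in the text."""
    n = counts.get(term, 0)
    return f"found {n} times" if n else "not found"
```

Searching for "ca" in a text containing "cat" reports "not found" here, even though the characters do occur in the text.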
Added ability to search text either forward or reverse. The
search will automatically wrap around to the beginning (or end) of the
file when it reaches the end (or beginning).
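The wrap-around behaviour amounts to "try from the current position, and if that fails, try again from the other end of the file". A small sketch on a plain string, with a hypothetical function name:

```python
def find_wrapping(text, term, start, reverse=False):
    """Return the index of the next match from `start`, wrapping around
    the end (or beginning) of the text; -1 if the term never occurs."""
    if not reverse:
        pos = text.find(term, start)
        if pos == -1:
            pos = text.find(term)        # wrap to the beginning
    else:
        pos = text.rfind(term, 0, start)
        if pos == -1:
            pos = text.rfind(term)       # wrap to the end
    return pos
```

A result of -1 after the wrap means the term is not in the text at all.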
Added a pop up hot key listing under the help menu.
Worked on filename / path parsing for converting to DOSish format for
gutcheck. Gutcheck is still just a DOS program so it can't parse file
names with more than 8.3 characters or pathnames with spaces / more
than 8 characters. Poked around in some of the more obscure perl
documentation. Think I've got it now. Fixed problem throughout program.
By user request, the gutcheck box now stays on top when focus changes
to the main window.
Added file filter on file open dialog to default to .txt files.
Version .03 - (94K) Fixed problem where file name would not be
updated in header until you clicked in the text box. Twiddled around
with fixup routine, sped it up by an order of magnitude without
sacrificing function. :-) Now it's only about 20% slower on average
than it would be without updating the display at all. Made some changes
to word counting routine to recognize numbers with commas or periods in
them as a single entity. Changed file open routine to remember the last
directory session to session. Added a couple more hot keys. Added a
hyphen check function to the word frequency window. Shows a list of all
of the hyphenated words in the file along with any words that are
identical except without a hyphen. Added a basic stealth scannos
checking routine. Grafted it onto the search and replace function.
Basically, it allows you to load a file of stealth scanno pairs and
automatically load them into the search and replace box one by one.
I've included Big Bill's English Common Stealth Scannos list from the
CVS site, formatted to work with the script. It is in the scannos
directory, under the script folder. You can add others if you like.
Fixed problems with windows that would only open once. Lots of
miscellaneous tweaking and tuning. Updated manual.
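The hyphen check mentioned in the Version .03 notes can be sketched as a pass over the word frequency table: for each hyphenated word, look for the same word with the hyphens removed. The names below are invented and the real display surely differs.

```python
from collections import Counter

def hyphen_report(counts):
    """For each hyphenated word, return (word, count, unhyphenated twin),
    where the twin is None if that spelling does not occur in the text."""
    report = []
    for word in sorted(counts):
        if "-" in word:
            joined = word.replace("-", "")
            twin = joined if joined in counts else None
            report.append((word, counts[word], twin))
    return report
```

A pair such as "to-day" and "today" showing up together is a hint that the text is inconsistent, or that an end-of-line hyphen was handled wrongly.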
Version .02 - (90K) What version .01 should have been before it ever saw
the light of day. Better calling of gutcheck routine. Better parsing of
gutcheck error file. Better coupling of gutcheck output window with
text window. Vastly improved search function with text highlighting.
Made Gutcheck options sticky session to session. Lots more fix up
functions and fixed a lot of bugs in the existing ones. Added Word
frequency / index function to count words and frequency with direct
ties to search function. Completely overhauled menu to work around bugs
in perl text module. Now include the latest version of gutcheck with
the script. Wrote a manual.
Version .01 - (6K) Initial release. No manual. Partial
functionality. Flaky operation. Basically a fetid pile of crap.... but
it runs... sort of.