If you use any MSWord compatible word processor in Linux the chances are
extremely high that you are using software that Caolan wrote, the infamous
MSWordView library (which has recently been renamed to
wvware). Caolan is a member of the Irish
Linux Users' Group and graciously allowed us to be the first to interview him.
Some guy called Caolan asked B^>:
- what editor do you use?
Vim, but it's not a religious choice. The first manuals I got for unix
had the vi chapter before the emacs one :-). Though I really got into
vim when on co-op I had an awesomely slow connection back to skynet and
vim was far easier to use over the link.
- Do you use revision control systems of some kind, and have you any
serious thoughts on the matter?
The abiword project uses CVS and I use CVS there. Apart from that I
have little experience with it. I do think that there are some
serious flaws with using CVS for globally distributed projects such
as gnome, Mozilla, kde etc etc. Putting the Linux kernel into CVS for
instance would be a disaster. What people forget when they suggest using
CVS is that there is one central server that you have to contact to keep in
sync. A project running from a CVS server in the US is a project which
is difficult for a European to work with. A single point of failure, and
slow connect and update times. I really feel strongly that CVS is something
that shouldn't be used for gnome or Mozilla. It's fine if there is a
committed group of core developers who live near the server :-), but it makes
it very difficult for the casual hacker to sweep in and make a few
modifications, you have to find the server, CVS login, CVS checkout, CVS
update, compile, it doesn't compile, you mess with it for a while, it
compiles but at this stage you're bored because you only wanted to see if
it is supporting graphics yet so that you can test adding tiff support to
CVS to me is inherently a cathedral masquerading as a bazaar. CVS is a small
moat to keep the hordes one step removed from your code.
You can make it work correctly though, I think the wine project has the best
CVS system for this kind of project, i.e.
- Regular weekly/biweekly tars and diffs of the tree that actually build
which are mirrored everywhere, and are close enough to the current CVS tree
that an update takes a very small time.
- Mailing list of changes that were made to the tree.
- It's in Denmark which is reasonably fast for me :-)
Maybe the CVS people should work on a CVS mirroring system, would be a
toughy to do I think, but worth looking into. Also someone (we are
planning it on skynet) should provide a service to automatically
convert CVS tree to regular tar files and run test builds. Plonk them
in ftp archive files marked with the date and their build status. Point
the script at the main CVS trees around the world and let it rip, a very
positive service methinks.
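A rough sketch of what such a snapshot service might do for one tree. It assumes the tree has already been checked out (the CVS step is left out), and the naming scheme, function names and "builds"/"broken" labels are invented for illustration, not anything described in the interview.

```python
import subprocess
import tarfile
from datetime import date
from pathlib import Path


def snapshot_name(project: str, day: date, build_ok: bool) -> str:
    """Name an archive by project, date and build status (hypothetical scheme)."""
    status = "builds" if build_ok else "broken"
    return f"{project}-{day:%Y%m%d}-{status}.tar.gz"


def try_build(tree: Path, build_cmd: list[str]) -> bool:
    """Run the build command inside the tree; True means it exited with status 0."""
    result = subprocess.run(build_cmd, cwd=tree, capture_output=True)
    return result.returncode == 0


def make_snapshot(project: str, tree: Path, out_dir: Path,
                  build_cmd: list[str], day: date) -> Path:
    """Test-build a checked-out tree, then archive it under a name carrying the result."""
    ok = try_build(tree, build_cmd)
    archive = out_dir / snapshot_name(project, day, ok)
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(tree, arcname=project)
    return archive
```

Something like `make_snapshot("wv", Path("wv-checkout"), Path("ftp"), ["make"], date.today())` run from cron against each tree would give the dated, status-marked archives described above; the paths and the `make` command here are placeholders.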
Favorite language, C, C++, some half arsed scripting language like perl?
Definitely C, I have some fondness for C++, but you have to keep a tight
rein on yourself to stop overdesigning some very aesthetically pleasing
class hierarchies which provide clean interfaces and hide complexity, which
might look nice but do nothing :-). I find myself eventually looking at a
very nice structure which nevertheless achieves nothing. Lots of passing
the buck around the system sort of pushing the execution flow about the
place aimlessly, but that's probably just me. As for stuff like tcl, perl and
so on, nice stuff but use them for glue between other things, using them to
write something serious seems like a disaster to me.
- X vs console?
X, loads of xterms I admit, but I like a good graphical system. Couldn't
live without netscape.
- window manager / gnome / kde environment?
For ages I've used afterstep. Gnome appeals to me and on my home machine
I have a few gnome applications running. KDE does not appeal to me I have to
say. I don't really care about the now irrelevant legal stuff, but I just
think that GNOME has the right attitude, pretty nebulous stuff to base
a preference on but there you go.
- Favorite hardware and os platform?
Whatever I can get, I've been happy with x86 and Linux for ages. So many
of the other unices are sucky when it comes to third party utilities and
libraries. I hate platforms that lack standard stuff like libpng libjpeg
- did you ever attempt to write yet another
- window manager?
Thought about it, dismissed the idea pretty quickly.
Oh yes, I did some serious thinking and started into a widget set before
QT and GTK came out. Written in C++ and called chameleon, it was supposed
to be uber-configurable with each widget a shared library and some dlopen
trickery to allow for instance a fileopen dialog to be completely swapped
with another one completely different in look and feel. Panning widgets
could be swapped so that you can go from a system with scrollbars on the
left and bottom to one where you got the panner style of scrolling that
you see in apps like editres. Nice idea, never really got off the ground
because I spent my time trying to munge the existing Xaw and other Xt based
widgets into the framework. I wanted to leverage all the existing widgets
into the system and failed miserably.
- X replacement?
No way, I do want to see some X improvements though. Printing and fonts
are really a painful area that we come against over and over again, but
I think they can be sorted out with custom extensions and so on. But
replacing X sounds foolish to me. Sure you can attempt to change the
architecture of our Xservers completely and that might be nice, so long
as we can run all the existing X programs and use the same protocol I'm
- programming language?
- operating system?
Nope to that too.
Has anyone offered to pay you/hire you based upon your open source
Yep :-), 8 job offers in 10 months and a few small contracts for modifications
and extensions, nothing big but got me out of overdraft a few times, though I'm
back in it again.
What application or component is Linux most lacking?
Video editing tools like Adobe Premiere, and the lack of sound creating
and editing tools is woeful. We have StarOffice for word processing,
we have Gnome for looking good, Gnumeric for spreadsheets, Oracle and
MySQL etc etc for databases, Gimp for graphics. Basically we're looking
good. Some work needed to round out the full suite of applications, but
no real sound tools, no real video tools. No equivalents for any good
Is that your first interview by anyone?
Damien O'Sullivan asked:
When are you going to cut your hair like airlied and Kate??
When I'm as old and senile as them?
Do you feel tempted to answer this question on list??
No, just to stick a fork in your eye.
Mark Twomey asked a question in two parts...
- In your opinion if Linux is to become a mass market solution, available
to users of different technical ability and skill, what segments in the
community's current development process (if any) do you believe have to
Some of the larger Linux companies need to do some serious user studies
of the usage of the systems, identify the bottlenecks that users
perceive and get actual evidence of what Linux needs. Some actual field
studies, formal testing and evidence gathering. There are far too many
unsubstantiated opinions on what needs to be done or what direction
Linux should go in. I think we need to actually video tape a couple of
hundred Linux users in their workplaces and study what they do. See what
applications they use, what problems they face and what the requirements
are. Free software is reaction based, find a problem and fix it. I think
that we need to identify clearly some of the problems that are so big and
obvious that we cannot even see them anymore.
Following that, how soon do you believe we can move from the "hard and
fast" development cycle, to delivering elegant, usable solutions that will
appeal to a mass market?
The hard and fast development cycle gets stuff done. Designing elegant
solutions that live in the clouds and never actually get implemented is
a trap that we must avoid. I do think that it would be a good idea for RedHat
and LinuxCare etc to hire polishers. People whose job is to keep an eye
on the software scene and add the last layer of polish to programs, i.e.
make them look good, port programs fully over to Linux, modify programs
to use standard libraries, break programs into reusable libraries etc.
A small team of "software gardeners" to add those last few percent to programs
to bring everything that last step.
But I do think the commercial Linux companies have a large part to play,
and it's to their advantage to support the existing software model. I think
they could be a very positive force to improve the overall level of
usability of Linux.
- What made you work on MsWordview in the first place?
Same as most gnu style projects, I needed to read word documents, and
nothing was available for Linux, even staroffice at the time could
only handle non fastsaved documents. And strings didn't really cut it :-)
- What is your motivation for working on free software?
Sticking it to the man? No really it was just to fill a void.
Programmers abhor a vacuum maybe? No-one else was doing it, it
needed doing and I reckoned I could make enough of a start to get
the ball rolling, though I have to admit a certain lack of submitted
code. I attribute most of that to my lack of comments, unclear code
and hodge podge approach to problem solving :-)
Do you work on free software because it's entertaining or because of
the "fame" it gets you?
Entertaining, I could regale you for hours on the joys of parsing
undocumented structures. There's nothing like finding the solution to
an intractable problem to get that fuzzy glow of accomplishment.
What do you think motivates people to work on free software?
It's not fame based. I see many projects starting from large public
announcements and grandiose crowd pleasing plans that never seem to
pan out. The successful ones start with a few people who have a well
defined problem that affects them directly in their own day to day
life and they attempt to solve it. The "free" bit is an afterthought
for most people. Free because it was to solve a problem of their
own, and now that they have achieved that, why not solve everyone
else's at the same time? Personally I was motivated to release it
for free because of all the free stuff that *I* had used, where
would we be without gcc, Linux and the gnu project? If you haven't
the money to donate to gnu etc etc you might as well give code in
If you had to start MsWordView again, would you?
Yeah, it was good fun.
Is the "fame" worth the grief?
There's no fame, maybe an email or two a day saying thanks, which
is nice. But temper that against four or five looking for a new feature
or moaning that it doesn't compile under sgi or hpux or sommat and it can
get you down some days as well.
Do you think the reward for msWordView (many kudos) exceeded the
effort it took to write it in the first place?
I suppose it has balanced out fairly well for me. I'm content with the way things
have panned out.
What brought you from computer engineering to software engineering
which are VERY different disciplines?
I'm not too sure what you're getting at here, as I'm not doing software
engineering, I'm currently doing Human Computer Interaction, which was
part of my masterplan to gain the kind of skills that I believe are necessary
to design and write quality software which people directly interact with, it
ties in nicely with computer engineering where you never consider the people
that use the system, just the technical issues. Raw HCI on the other hand
can often lack a certain hard focus that you get in engineering disciplines,
so I think they both tally well together.
Do you think you will continue work on MsWordView even though you will
be paid by Star division to work on their stuff?
Dunno really, we'll wait and see. I need to move on a bit, I've stagnated in
Limerick far too long, so I'd like a clean break from everything that I'm
doing right now.
Are you going to cut your hair or is your programming ability tied to
the length of your hair like Samson?
Is a programmer's ability connected to the length of their hair?
Yes, it's definitely directly proportional to the length of hair. Never heard
of a good shorthaired programmer. Hair is a bogon shield, the more you have,
the less bugs you get.
In your opinion, how important is facial hair to GPL software?
Facial hair is completely different of course, beards are strange attractors
and cause all sorts of problems. I'm a bit concerned about this strange
obsession with bodily hair that crops up frequently in these questions, odd
Harry Walsh asked:
How much time do you spend every week on your projects?
Far far too much, my thesis has suffered incredibly, I spend about 2
days of my 9-5 time working in it, and usually every evening, though
the evenings are usually spent reading a book while checking on
the progress of the automated testing searching for crashes and
malformed output html.
Why do you do it? Why the word format? wvWare has become
a very useful library now. But in the beginning it must have looked
like such a mammoth task. Did you always intend it to get this far,
or is it snowballing like Linux did?
It completely snowballed, there were a few text extractors for word
docs previously, basically strings with bells and whistles attached,
but I gave the spec a read when it came available and decided to
implement the "fastsave" algorithm so as to be able to dump just
the text of the document. That required a load of code :-), tables
seemed natural after that, and lists, then graphics. Some character
properties seemed like a good idea then, just bold and italic I
said to myself, that got out of hand quickly. Para properties, etc etc.
The abiword project started and began using mswordview as their
word import filter, and mswordview was always intended as a quick and
dirty word to html filter so I had to call a halt and restart the whole
thing as wv, so that abiword and others could use it. This time round
I added word 6 and 7 support. Then I had to implement a wmf converter
as well because there was none for Linux, then I got sucked into
implementing the "escher file format" in which most of the graphics
are embedded. It's escher that I'm looking at today for instance.
The encrypted word97 document problem annoyed me for a long while, it
was always low priority, but niggling at me for ages. It was a real
internet power of collaboration thing that got that started and
completed, once 97 was done I could do 95/6 encryption in an evening
as it's poxy encryption. I started, but stopped as it was getting really
off into a tangent, a word6/7 encryption recovery program. You don't
really need the password to decrypt them, there is so much available
non encrypted information which you can use to decrypt it anyway.
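To illustrate the general idea of recovering a key from known unencrypted content, here is a toy sketch. It assumes, purely for illustration, a scheme that XORs the data with a short repeating key and a stretch of known plaintext at the very start; the interview does not describe the actual Word 6/7 scheme, and the function names are made up.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def recover_key(ciphertext: bytes, known_plaintext: bytes, key_len: int) -> bytes:
    """Recover a repeating XOR key from plaintext known to sit at offset 0.

    Assumes the key length is known and that at least key_len bytes of the
    original content are predictable (for example a fixed header).
    """
    if len(known_plaintext) < key_len or len(ciphertext) < key_len:
        raise ValueError("need at least key_len bytes of known plaintext")
    return bytes(c ^ p for c, p in zip(ciphertext[:key_len],
                                      known_plaintext[:key_len]))
```

With a weak scheme of this kind the password never enters into it: XORing the ciphertext against the bytes you already expect gives back the key, and the key decrypts the rest.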
The thing was that I got a bit obsessive about the whole conversion
thing, I believe that it's of paramount importance for Linux/unix
to be able to interoperate transparently with the rest of the computing
world. Better samba, better office import ability, up to date
readers for .ra, modern html, vector file formats, etc etc. These are
vital. It's the interoperability of data that I believe is a prime limiting
factor for the take up of Linux by ordinary users. You can have the best
filesystem, fastest graphics, coolest interface, but it's all nothing unless
you can move over to Linux from your older system and bring both your own
older data with you and keep interoperability with your friends on their
legacy systems such as windows and macos.
What next? wvWare can only go so far. Do you intend to
chase Microsoft file formats for ever? I think someone with your
initiative and motivation could move onto more important things if a
strong and motivated wvWare development team were to form. What are
your thoughts on this whole area?
I'm joining stardivision/sun in January (signed contract today), initially
to continue on in the same vein as before, and certainly into different
areas after that. But the whole data transformation thing fascinates me (at
the moment anyway). Data transformations and serious code reuse is what
I'm thinking about now. The kind of things that I envision are metaformats
and transformation routes between data formats, data translation and
handling orbs. Universal data formats for various data families.
XML is an envelope into which you can put data and unicode is a format in
which you can store your raw text, but there's more to be done yet.
Universal drawing, word processing and spreadsheet formats, mechanisms to
find a route with the minimal loss of data between formats, methods of
determining a numerical value to put on the quality of a transformation
route. You receive something in your mail, click on it and the client
contacts its local translation broker with a list of what you can handle
and it gets munged into one of those formats.
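One way such a broker could pick a route, sketched under stated assumptions: each converter gets a quality score between 0 and 1 (1.0 meaning lossless), a route's quality is the product of its steps, and the broker searches for the best-scoring route to any format the client accepts. The scoring model, the function name and the format names are all invented here, not taken from the interview.

```python
import heapq


def best_route(converters, source, accepted):
    """Return (quality, path) for the least lossy route, or None if there is none.

    converters maps (from_format, to_format) -> quality in (0, 1].
    accepted is the set of formats the client says it can handle.
    """
    graph = {}
    for (src, dst), quality in converters.items():
        graph.setdefault(src, []).append((dst, quality))

    best = {source: 1.0}
    heap = [(-1.0, source, [source])]  # negate quality to get a max-heap
    while heap:
        neg_quality, fmt, path = heapq.heappop(heap)
        quality = -neg_quality
        if fmt in accepted:
            return quality, path  # first accepted format popped is the best one
        if quality < best.get(fmt, 0.0):
            continue  # stale entry, a better route to fmt was already found
        for nxt, step_quality in graph.get(fmt, []):
            new_quality = quality * step_quality
            if new_quality > best.get(nxt, 0.0):
                best[nxt] = new_quality
                heapq.heappush(heap, (-new_quality, nxt, path + [nxt]))
    return None


# Hypothetical converters and scores, for illustration only.
converters = {
    ("doc", "rtf"): 0.9,
    ("rtf", "html"): 0.8,
    ("doc", "html"): 0.6,
    ("doc", "txt"): 0.3,
}
route = best_route(converters, "doc", {"html"})
```

With these made-up scores the two-step route through rtf (about 0.72) beats the direct doc to html converter (0.6). Because no step can raise the quality above 1.0, the greedy search is safe: the first acceptable format taken off the heap is reached by the best available route.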
Other things that bother me in this family of problem is that not alone
should we use libraries for everything, one libpng, one libjpeg. But we
should follow this technique through to everything. Extended to the absolute
hilt. You can use XML to describe a UI, you can use libraries to reuse and
share code, so why do we have so many window managers, email programs, crap
browsers, editors etc etc etc? We should have one -lWindowManager which
provides all the core code that a window manager needs, one libpop which
will pop mail down for you, one libimap, etc etc etc. One core
program whose ui is an XML description that can be tweaked and utterly modified.
Not just change a few skins, but change the layout etc etc. You can see
feelers in that direction in many well designed programs that you see, the
new Mozilla for instance. Just about every program that exists on a unix
system should be max about 50-60k, with the vast majority of code in big
documented libraries that you can grab functionality out of in an instant.
That whole perl reimplementation of all the unix basic commands that I saw
underway at one stage is something that should be done in C for Linux. I want
libls dammit !, I've got a brand new interface for ls and I want to do it now,
yesterday. system and pipes et al are all very well, but you really need to
link against the bastards to do everything you want without grief. One
bloody linked list implementation, give me more of that glib stuff and I'll take
a dash of STL while I'm at it, and give me more useful libraries. If you write
a graphical program put the vast majority of your non graphical code in a
separate library and link against it. It makes the world a more pleasant
John "Kate" Looney asked:
What do you think it is that drives perfectly normal people to re-write
"hi-profile" software, like window managers and mail clients, when there
are so many out there already, in desperate need of improvement?
I reckon most of these types of rewrites are slightly outside the usual
"I have a problem, I must fix it". They are usually of the "I want to
write some software to get some practice/kudos/learn something about how
it all works". Which are all admirable reasons to do something, the catch
with this approach is that you don't have a well defined goal. So as soon
as you gain that practice/realize no one wants to hear about your yet another
thingy/learn what you want to know, then the project tends to disappear
into the mud again. Usually half finished and missing too much functionality
to be useful. I also wonder how many projects stop because in the rewrite
of an alternative to an existing project with a certain flaw, the new writer
realizes that the flaw they want to avoid in their swishy new project is
a workaround for a problem which is intractable otherwise. Or they find that
their shiny new feature is going to be incredibly difficult to achieve, like
for instance yet another replacement for X.
But as to precisely answer why they don't improve an existing project, I
imagine that it's a combination of a few factors, maybe
a) The initial time taken to get up to speed with a couple of thousand
of the original is offputting, and the author would rather just get straight
b) People want to have their *own* project, not to be a spoke in someone
c) He may just be unaware of the other project
Personally I think you can avoid a and b by sticking everything in multiple
libraries, documenting the api carefully, and then people can *own* their own
project which sits on top, which also gives them a good reason to improve the
understructure. If window managers worked like this, I imagine we'd have 10
window managers, but they would all be far closer to completion and much
much much more compatible and bug free.
If Microsoft were forced to open source their cash-cows, like office and
windows, would you try to make sense of the code, to better your own
software? Can you think of anything else you could do with the code
Cut off its head and fill the exposed trachea with garlic whilst pounding
stakes into various locations maybe?
The Microsoft code would of course be useful, but nowhere near as useful as
you might imagine, the MS world is a different place, the kind of place
where they have C++ classes with serialize methods that save to
disk by writing the structure raw to disk without any of that fancy pants
conversion to a rational file format that you might see in other products.
I really honestly feel that MS itself has lost track of how their older
cashcow products actually work. The code base has evolved from the original
versions slowly and inexorably, never does a company like MS really say to
themselves that "this has become a shambles, let's set 100 people aside and
redesign this beast and start again", they have to always run to stay where
they are. The file format change from 7 to 8 involved converting all the
characters to unicode characters, adding a new list and graphic escher format
because they couldn't continue using the old one any longer because either
a) they forgot how it worked, or b) (more likely I suppose) it was a very poor
implementation in the first place. The scary thing is that both the old list
format and graphic format can still show up in the new file format, pretty
much at random when the document was converted to 97 in 97 from an older
format. If they cannot get it together with access to the code for 10 years,
letting us in on it might not help greatly. Despite all this, yes of course
seeing the code will improve everyone's lot, especially MS themselves,
I include a quote that came my way from some Indian software developers
who had contacted the Asian Microsoft branch on the word format topic
"We requested Microsoft regarding this. They said, they are not supporting
MS-Word's file format to developers. The reason they say is, they themselves
do not have complete (clear) documentation.
Also they pointed your website
http://www.wvware.com for additional reference"
If you could get other people, and other peoples money, what
sort of an event would you organise, like GNOME vs. KDE paintball, or AOL
install CD skeet shoot..?
Quasar with k&r indentation style users vs everyone else? die k&r
users, budda budda budda, could be good.
About the author, Ken Guest.