On Mon, Jan 24, 2000 at 05:19:36PM +0000, Michael Conry wrote:
> I have noticed during the boot sequence that there are
> warnings/errors about 5 or so unresolved symbols. These are all
> associated with ne.o. I do not honestly know if these errors were there
> from the start, or whether they have occurred since. Since the install,
> the main changes I have made to the system have been adding a printer,
> installing LaTeX, and installing and configuring Samba. I have not
> recompiled the kernel (though I have done it on another system, and I
> realise that this ties up with the symbols issue through System.map).
> The network card still seems to be working fine.
System.map has nothing to do with kernel modules.
Here comes the science...
System.map is used by 'ps' and 'klogd' (syslog's kernel-log daemon) to convert kernel addresses
into function names. Doing 'ps l' will show you the names of the
functions that the processes are currently waiting in.
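
To make the address-to-name lookup concrete, here is a minimal sketch in Python. It assumes the usual System.map line layout ("address type name"); the addresses and the helper names (load_system_map, address_to_name) are invented for illustration and are not what 'ps' itself uses internally.

```python
import bisect

# A few lines in System.map layout: "<hex address> <type> <symbol name>".
# The addresses are invented for illustration.
SAMPLE_SYSTEM_MAP = """\
c0100000 T _stext
c0110a40 T schedule
c0112f00 T interruptible_sleep_on
c0125c10 T do_select
"""

def load_system_map(text):
    """Return (addresses, names) as parallel lists sorted by address."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        address, _symbol_type, name = fields
        entries.append((int(address, 16), name))
    entries.sort()
    return [a for a, _ in entries], [n for _, n in entries]

def address_to_name(addresses, names, address):
    """Name the symbol with the highest start address <= address, or None."""
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    return names[index]

addresses, names = load_system_map(SAMPLE_SYSTEM_MAP)
# An address a little way into interruptible_sleep_on maps back to its name.
print(address_to_name(addresses, names, 0xc0112f24))
```

The lookup picks the nearest symbol at or below the address, which is why a process sleeping partway through a function is still reported under that function's name.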
Symbols in kernel modules are a different beast. When you compile a
kernel, certain functions and global variables are 'exported' by the
kernel. (Only stuff that is needed by modules is exported, rather
than every internal kernel function.) This results in an exported
symbol table that's embedded in the kernel image itself. These
are symbols such as 'sk_alloc' and 'sk_free' which would be used
by networking modules.
When you compile a module that calls these functions, the linker
is unable to completely 'resolve' these symbols. In other words,
it can't find an address to go along with these functions. And
that's OK. Instead, when insmod loads a module, it looks for
each of these unresolved symbols in the kernel's exported symbol
table and finds the address of these functions in this particular
kernel.
That way, if the kernel gets recompiled with different options,
and sk_alloc() is moved higher or lower in memory, insmod is
still able to find it and hook up the module's calls to sk_alloc()
accordingly.
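
As a rough illustration of what happens at load time, here is a toy model in Python. It is not insmod's code: the dictionaries stand in for the kernel's exported symbol table, and the addresses, the resolve() helper and the UnresolvedSymbols exception are all made up for the example.

```python
class UnresolvedSymbols(Exception):
    """Raised when a module needs symbols the kernel does not export."""

def resolve(module_undefined, kernel_exports):
    """Map each symbol the module needs to its address in this kernel."""
    missing = [name for name in module_undefined if name not in kernel_exports]
    if missing:
        raise UnresolvedSymbols(", ".join(missing))
    return {name: kernel_exports[name] for name in module_undefined}

# Two builds of a kernel exporting the same functions at different
# (invented) addresses, plus one built without networking at all.
kernel_a = {"sk_alloc": 0xc0190000, "sk_free": 0xc0190120}
kernel_b = {"sk_alloc": 0xc01a4000, "sk_free": 0xc01a4120}
kernel_no_net = {}

# The symbols a hypothetical networking module left unresolved at link time.
net_module_needs = ["sk_alloc", "sk_free"]

for label, exports in (("a", kernel_a), ("b", kernel_b), ("no_net", kernel_no_net)):
    try:
        table = resolve(net_module_needs, exports)
        print(label, {name: hex(addr) for name, addr in table.items()})
    except UnresolvedSymbols as error:
        print(label, "unresolved symbols:", error)
```

The same module resolves against both kernel_a and kernel_b even though the addresses differ, and it is refused outright by the kernel that exports neither function.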
So an unresolved symbol at module load time usually means
that the module is trying to use functions that
just don't exist in that kernel. For example, a SCSI driver
module won't load if the kernel was compiled with absolutely
no SCSI support, because it won't be exporting functions such
as scsi_do_cmd().
However, there is another wrinkle... The kernel developers
don't want to fix the details of these exported functions in
stone. If they did, kernel development would get significantly
slower, and we'd end up with a horrible mess of code to deal
with evil backward compatibility issues. They want to be able
to do things like add another parameter to scsi_do_cmd().
But that would break compatibility with any modules that were
compiled for the older version. The modules would load
fine, but probably crash because they are calling
scsi_do_cmd() with the wrong arguments.
Instead, we have a neat hack called modversions. During a
kernel compile, the exported symbol names are mangled by
adding on a suffix _that_depends_on_the_argument_list_.
For example, in my current kernel, scsi_do_cmd() is
actually exported as scsi_do_cmd_R0f4ba7a7() and modules
compiled for my kernel refer to it by this name.
Now, as long as the argument list (and argument types and
return value type) of scsi_do_cmd stay the same in future
(and past) kernels, resolution of this symbol in a scsi driver
module will work. But, as soon as a kernel hacker changes the
'signature' of scsi_do_cmd(), my scsi drivers will fail to
load (until I re-compile them for this new kernel).
So this modversions hack allows modules compiled for one
kernel to work with a different kernel. More importantly,
the module will _fail_to_load_ if there is a compatibility
problem, rather than load fine and crash horribly.
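
Here is a small Python sketch of the idea. It is only an approximation: the real checksum is produced by the kernel build tools (genksyms) from the function's declaration, whereas this uses zlib.crc32 over a prototype string as a stand-in, and both prototypes below are invented placeholders, not the real scsi_do_cmd() declaration.

```python
import zlib

def versioned_name(name, prototype):
    """Append a suffix derived from a checksum of the prototype text."""
    checksum = zlib.crc32(prototype.encode("ascii")) & 0xffffffff
    return "%s_R%08x" % (name, checksum)

# Invented prototypes: the "new" kernel adds one parameter.
old_prototype = "int scsi_do_cmd(void *cmd, int timeout)"
new_prototype = "int scsi_do_cmd(void *cmd, int timeout, int retries)"

# What a module built against the old kernel asks for at load time.
module_wants = versioned_name("scsi_do_cmd", old_prototype)

# What each kernel actually exports (the address is invented).
old_kernel_exports = {versioned_name("scsi_do_cmd", old_prototype): 0xc01c2000}
new_kernel_exports = {versioned_name("scsi_do_cmd", new_prototype): 0xc01c2000}

print(module_wants in old_kernel_exports)  # the module loads here
print(module_wants in new_kernel_exports)  # here it is an unresolved symbol
```

Because the suffix changes whenever the declaration changes, the mismatch is caught as an ordinary unresolved symbol at load time instead of as a crash later on.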
> Anyway does anyone know how these errors occurred and how to get rid
> of them? I would be very grateful for any help (can't find answer in
> Linux in nutshell, SuSE manual, or ILUG archive)
Now, in your case, if you have a distribution-supplied
kernel and a distribution-supplied ne.o, you _shouldn't_
have any trouble with it. Are you _sure_ the problem is
with ne.o? Try booting to single-user mode (or just shut
down all networking) and do an 'rmmod ne' to make sure
the module is not loaded ('lsmod' will confirm). Then do 'modprobe ne' to load
it and its dependent modules. If you get errors now,
then we can be sure it is caused by ne.o.
If so, tell SuSE, or check their site for updates.
Personally I think it is unlikely, because unresolved symbols
should mean ne.o fails to load and you don't get any
networking. Yet you've got Samba up and running. Something
doesn't add up...
Later,
Kenn