The Linux SCSI HOWTO
Drew Eckhardt, <drew@PoohSticks.ORG> (transformed to linuxdoc-sgml
format by Dieter Faulbaum, <faulbaum@bii.bessy.de>)
v2.30, 30 August 1996
1. Introduction
This documentation is free documentation; you can redistribute it
and/or modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.
This documentation is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this documentation; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
That said, I'd appreciate it if people would ask me
<drew@PoohSticks.ORG> if there's a newer version available before they
publish it. When people publish outdated versions, I get questions
from users that are answered in newer versions, and it reflects poorly
on the publisher. I'd also prefer that all references to free
distribution sites, and possibly competing distributions/products be
left intact.
IMPORTANT :
BUG REPORTS OR OTHER REQUESTS FOR HELP WHICH FAIL TO FOLLOW THE
PROCEDURES OUTLINED IN THE ``REPORTING BUGS'' SECTION WILL BE IGNORED.
This HOWTO covers the Linux SCSI subsystem, as implemented in Linux
kernel revision 1.2.10 and newer alpha code. Earlier revisions of the
SCSI code are _unsupported_, and may differ significantly in terms of
the drivers implemented, performance, and options available.
For additional information, you may wish to join the linux-scsi
mailing list by mailing majordomo@vger.rutgers.edu with the line
subscribe linux-scsi
in the text. You can unsubscribe by sending mail to the same address
and including
unsubscribe linux-scsi
in the text.
Once you're subscribed, you can send mail to the list at
linux-scsi@vger.rutgers.edu
I'm aware that this document isn't the most user-friendly, and that
there may be inaccuracies and oversights. If you have constructive
comments on how to rectify the situation you're free to mail me about
it.
2. Common Problems
This section lists some of the common problems that people have. If
nothing here answers your questions, you should also consult the
sections for your host adapter and for the devices that are giving you
problems.
2.1. General Flakiness
If you experience random errors, the most likely causes are cabling
and termination problems.
Some products, such as those built around the newer NCR chips, feature
digital filtering and active signal negation, and aren't very
sensitive to cabling problems.
Others, such as the Adaptec 154xC, 154xCF, and 274x, are _extremely_
sensitive and may fail with cables that work with other systems.
I reiterate : some host adapters are _extremely_ sensitive to cabling
and termination problems and therefore, cabling and termination should
be the first things checked when there are problems.
To minimize your problems, you should use cables which
1. Claim SCSI-II compliance
2. Have a characteristic impedance of 132 ohms
3. All come from the same source to avoid impedance mismatches
4. Come from a reputable vendor such as Amphenol
Termination power should be provided by _all_ devices on the SCSI bus,
through a diode to prevent current backflow, so that sufficient power
is available at the ends of the cable where it is needed. To prevent
damage if the bus is shorted, TERMPWR should be driven through a fuse
or other current limiting device.
If multiple devices, external cables, or FAST SCSI 2 are used, active
or forced perfect termination should be used on both ends of the SCSI
bus.
See the Comp.Periphs.Scsi FAQ (available on tsx-11 in
pub/linux/ALPHA/scsi) for more information about active termination.
2.2. The kernel command line
Other parts of the documentation refer to a "kernel command line".
The kernel command line is a set of options you may specify from
either the LILO : prompt after an image name, or in the append field
in your LILO configuration file (LILO .14 and newer use
/etc/lilo.conf, older versions use /etc/lilo/config).
Boot your system with LILO, and hit one of the alt, control, or shift
keys when it first comes up to get a prompt. LILO should respond with
:
At this prompt, you can select a kernel image to boot, or list them
with ?. Ie
:?
ramdisk floppy harddisk
To boot that kernel with the command line options you have selected,
simply enter the name followed by a white space delimited list of
options, terminating with a return.
Options take the form of
variable=valuelist
Where valuelist may be a single value or comma delimited list of
values with no whitespace. With the exception of root device,
individual values are numbers, and may be specified in either decimal
or hexadecimal.
Ie, to boot linux with an Adaptec 1520 clone not recognized at bootup,
you might type
:floppy aha152x=0x340,11,7,1
If you don't care to type all of this at boot time, it is also
possible to use the LILO configuration file "append" option with LILO
.13 and newer.
Ie,
append="aha152x=0x340,11,7,1"
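As a rough illustration of the variable=valuelist format described
above, the following Python sketch splits one option into its name and
values. The function name parse_option is hypothetical, and this is not
how the kernel itself parses options; it only mirrors the rules stated
in this section (comma delimited, no whitespace, numbers in decimal or
hexadecimal, with non-numeric values such as a root device left as
text).

```python
def parse_option(option):
    """Split 'variable=valuelist' into (variable, [values])."""
    name, sep, valuelist = option.partition("=")
    if not sep or not name:
        raise ValueError("expected variable=valuelist, got %r" % option)
    values = []
    for item in valuelist.split(","):
        try:
            # int(x, 0) accepts decimal ("11") and 0x-prefixed hex ("0x340").
            values.append(int(item, 0))
        except ValueError:
            # Non-numeric values (for example a root device) stay as text.
            values.append(item)
    return name, values


if __name__ == "__main__":
    print(parse_option("aha152x=0x340,11,7,1"))
```

What each number means (I/O port, IRQ, and so on) depends on the
driver, so consult the section for your host adapter.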
2.3. A SCSI device shows up at all possible IDs
If this is the case, you have strapped the device at the same address
as the controller (typically 7, although some boards use other
addresses, with 6 being used by some Future Domain boards).
Please change the jumper settings.
2.4. A SCSI device shows up at all possible LUNs
The device has buggy firmware.
As an interim solution, you should try using the kernel command line
option
max_scsi_luns=1
If that works, there is a list of buggy devices in the kernel sources
in drivers/scsi/scsi.c in the variable blacklist. Add your device to
this list and mail the patch to Linus Torvalds
<Linus.Torvalds@cs.Helsinki.FI>.
2.5. You get sense errors when you know the devices are error free
Sometimes this is caused by bad cables or improper termination.
See section ``General Flakiness''
2.6. A kernel configured with networking does not work
The auto-probe routines for many of the network drivers are not
passive, and will interfere with operation with some of the SCSI
drivers.
2.7. Device detected, but unable to access
A SCSI device is detected by the kernel, but you are unable to access
it - ie mkfs /dev/sdc, tar xvf /dev/rst2, etc fails.
You don't have a special file in /dev for the device.
Unix devices are identified by a type, either block or character
(block devices go through the buffer cache, character devices do not),
a major number (ie which driver is used - block major 8 corresponds to
SCSI disks), and a minor number (ie which unit is being accessed
through a given driver - ie character major 4, minor 0 is the first
virtual console, minor 1 the next, etc). However, accessing devices
through this separate namespace would break the unix/Linux metaphor of
"everything is a file," so character and block device special files
are created under /dev. This lets you access the raw third SCSI disk
device as /dev/sdc, the first serial port as /dev/ttyS0, etc.
The preferred method for creating a file is using the MAKEDEV script -
cd /dev
and run MAKEDEV (as root) for the devices you want to create - ie
./MAKEDEV sdc
wildcards "should" work - ie
./MAKEDEV sd\*
"should" create entries for all SCSI disk devices (doing this should
create /dev/sda through /dev/sdp, with fifteen partition entries for
each)
./MAKEDEV sdc\*
"should" create entries for /dev/sdc and all fifteen permissible
partitions on /dev/sdc, etc.
I say "should" because this is the standard unix behavior - the
MAKEDEV script in your installation may not conform to this behavior,
or may have restricted the number of devices it will create.
If MAKEDEV won't do the right magic for you, you'll have to create the
device entries by hand with the mknod command.
The block/character type, major, and minor numbers for the various
SCSI devices are specified in the ``Device Files'' subsection of the
section for each device type.
Take those numbers, and use (as root)
mknod /dev/device b|c major minor
ie -
mknod /dev/sdc b 8 32
mknod /dev/rst0 c 9 0
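For SCSI disks, the numbers in the example above follow a simple
pattern: block major 8, with sixteen minors per disk (the whole disk
plus fifteen partitions). The Python sketch below works out the mknod
arguments for names from sda through sdp. The helper names are
hypothetical, and the "Device Files" information for each device type
remains the authority if your kernel differs.

```python
import re

SD_MAJOR = 8           # block major for SCSI disks
MINORS_PER_DISK = 16   # whole disk plus fifteen partitions


def sd_device_numbers(name):
    """Return (type, major, minor) for names such as 'sdc' or 'sdc3'."""
    match = re.fullmatch(r"sd([a-p])(\d*)", name)
    if match is None:
        raise ValueError("not a SCSI disk name in sda..sdp: %r" % name)
    disk = ord(match.group(1)) - ord("a")
    digits = match.group(2)
    partition = int(digits) if digits else 0   # 0 means the whole disk
    if digits and not 1 <= partition <= 15:
        raise ValueError("partition number out of range: %r" % name)
    return "b", SD_MAJOR, disk * MINORS_PER_DISK + partition


def mknod_command(name):
    """Build the mknod command line for one SCSI disk device file."""
    kind, major, minor = sd_device_numbers(name)
    return "mknod /dev/%s %s %d %d" % (name, kind, major, minor)


if __name__ == "__main__":
    print(mknod_command("sdc"))
```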
2.8. SCSI System Lockups
This could be one of a number of things. Also see the section for
your specific host adapter for possible further solutions.
There are cases where the lockups seem to occur when multiple devices
are in use at the same time. In this case, you can try contacting the
manufacturer of the devices and see if firmware upgrades are available
which would correct the problem. If possible, try a different scsi
cable, or try on another system. This can also be caused by bad
blocks on disks, or by bad handling of DMA by the motherboard (for
host adapters that do DMA). There are probably many other possible
conditions that could lead to this type of event.
Sometimes these problems occur when there are multiple devices in use
on the bus at the same time. In this case, if your host adapter
driver supports more than one outstanding command on the bus at one
time, try reducing this to 1 and see if this helps. If you have tape
drives or slow cdrom drives on the bus, this might not be a practical
solution.
2.9. Configuring and building the kernel
Unused SCSI drivers eat up valuable memory, aggravating memory
shortage problems on small systems because kernel memory is unpagable.
So, you will want to build a kernel tuned for your system, with only
the drivers you need installed.
cd to /usr/src/linux
If you are using a root device other than the current one, or
something other than 80x25 VGA, and you are writing a boot floppy, you
should edit the makefile, and make sure the
ROOT_DEV =
and
SVGA_MODE =
lines are the way you want them.
If you've installed any patches, you may wish to guarantee that all
files are rebuilt. If this is the case, you should type
make mrproper
Regardless of whether you ran make mrproper, type
make config
and answer the configuration questions. Then run
make depend
and finally
make
Once the build completes, you may wish to update the lilo
configuration, or write a boot floppy. A boot floppy may be made by
running
make zdisk
2.10. LUNS other than 0 don't work
Many SCSI devices are horrendously broken, lock the SCSI bus up solid,
and do other bad things when you attempt to talk to them at a logical
unit someplace other than zero.
So, by default, recent versions of the Linux kernel will not probe LUNs
other than 0. To work around this, you need to use the max_scsi_luns
command line option, or recompile the kernel with the
CONFIG_SCSI_MULTI_LUN option.
Usually, you'll put
max_scsi_luns=8
on your LILO command line.
If your multi-LUN devices still aren't detected correctly after trying
one of these fixes (as the case will be with many old SCSI->MFM, RLL,
ESDI, SMD, and similar bridge boards), you'll be thwarted by this
piece of code
/* Some scsi-1 peripherals do not handle lun != 0.
I am assuming that scsi-2 peripherals do better */
if((scsi_result[2] & 0x07) == 1 &&
(scsi_result[3] & 0x0f) == 0) break;
in scan_scsis() in drivers/scsi/scsi.c. Delete this code, and you
should be fine.
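To see what that test looks at: in standard INQUIRY data, the low
three bits of byte 2 hold the ANSI version and the low four bits of
byte 3 hold the response data format. The code therefore stops probing
further LUNs on devices that report themselves as SCSI-1 with the old
response format. The Python sketch below mirrors the same check on a
byte string; the function name is hypothetical and the sample bytes in
the comments are made up.

```python
def skips_higher_luns(inquiry):
    """Mirror the C test above on the first bytes of INQUIRY data."""
    ansi_version = inquiry[2] & 0x07      # low three bits of byte 2
    response_format = inquiry[3] & 0x0F   # low four bits of byte 3
    return ansi_version == 1 and response_format == 0


if __name__ == "__main__":
    # A made-up SCSI-1 style response: version 1, response format 0.
    print(skips_higher_luns(bytes([0x00, 0x00, 0x01, 0x00])))
    # A made-up SCSI-2 style response: version 2, response format 2.
    print(skips_higher_luns(bytes([0x00, 0x00, 0x02, 0x02])))
```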
3. Reporting Bugs
The Linux SCSI developers don't necessarily maintain old revisions of
the code due to space constraints. So, if you are not running the
latest publicly released Linux kernel (note that many of the Linux
distributions, such as MCC, SLS, Yggdrasil, etc. often lag one or even
twenty patches behind this) chances are we will be unable to solve
your problem. So, before reporting a bug, please check to see if it
exists with the latest publicly available kernel.
If after upgrading, and reading this document thoroughly, you still
believe that you have a bug, please mail a bug report to the SCSI
channel of the mailing list where it will be seen by many of the
people who've contributed to the Linux SCSI drivers.
In your bug report, please provide as much information as possible
regarding your hardware configuration, the exact text of
all of the messages that Linux prints when it boots, when the error
condition occurs, and where in the source code the error is. Use the
procedures outlined in ``Capturing messages'' and ``Locating the
source of a panic()''.
Failure to provide the maximum possible amount of information may
result in misdiagnosis of your problem, or developers deciding that
there are other more interesting problems to fix.
The bottom line is that if we can't reproduce your bug, and you can't
point out to us what's broken, it won't get fixed.
3.1. Capturing messages
If you are not running a kernel message logging system :
Ensure that the /proc filesystem is mounted.
grep proc /etc/mtab
If the /proc filesystem is not mounted, mount it
mkdir /proc
chmod 755 /proc
mount -t proc /proc /proc
Copy the kernel revision and messages into a log file
cat /proc/version > /tmp/log
cat /proc/kmsg >> /tmp/log
Type CNTRL-C after a second or two.
If you are running some logger, you'll have to poke through the
appropriate log files (/etc/syslog.conf should be of some use in
locating them), or use dmesg.
If Linux is not yet bootstrapped, format a floppy diskette under DOS.
Note that if you have a distribution which mounts the root diskette
off of floppy rather than RAM drive, you'll have to format a diskette
readable in the drive not being used to mount root or use their
ramdisk boot option.
Boot Linux off your distribution boot floppy, preferably in single
user mode using a RAM disk as root.
mkdir /tmp/dos
Insert the diskette in a drive not being used to mount root, and mount
it. Ie
mount -t msdos /dev/fd0 /tmp/dos
or
mount -t msdos /dev/fd1 /tmp/dos
Copy your log to it
cp /tmp/log /tmp/dos/log
Unmount the DOS floppy
umount /tmp/dos
And shutdown Linux
shutdown
Reboot into DOS, and using your favorite communications software
include the log file in your trouble mail.
3.2. Locating the source of a panic()
Like other unices, when a fatal error is encountered, Linux calls the
kernel panic() function. Unlike other unices, Linux doesn't dump core
to the swap or dump device and reboot automatically. Instead, a
useful summary of state information is printed for the user to
manually copy down. Ie :
Unable to handle kernel NULL pointer dereference at virtual address c0000004
current->tss,cr3 = 00101000, %cr3 = 00101000
*pde = 00102027
*pte = 00000027
Oops: 0000
EIP: 0010:0019c905
EFLAGS: 00010002
eax: 0000000a ebx: 001cd0e8 ecx: 00000006 edx: 000003d5
esi: 001cd0a8 edi: 00000000 ebp: 00000000 esp: 001a18c0
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Process swapper (pid: 0, process nr: 0, stackpage=001a09c8)
Stack: 0019c5c6 00000000 0019c5b2 00000000 0019c5a5 001cd0a8 00000002 00000000
001cd0e8 001cd0a8 00000000 001cdb38 001cdb00 00000000 001ce284 0019d001
001cd004 0000e800 fbfff000 0019d051 001cd0a8 00000000 001a29f4 00800000
Call Trace: 0019c5c6 0019c5b2 0018c5a5 0019d001 0019d051 00111508 00111502
0011e800 0011154d 00110f63 0010e2b3 0010ef55 0010ddb7
Code: 8b 57 04 52 68 d2 c5 19 00 e8 cd a0 f7 ff 83 c4 20 8b 4f 04
Aiee, killing interrupt handler
kfree of non-kmalloced memory: 001a29c0, next= 00000000, order=0
task[0] (swapper) killed: unable to recover
Kernel panic: Trying to free up swapper memory space
In swapper task - not syncing
Take the hexadecimal number on the EIP: line, in this case 19c905, and
search through /usr/src/linux/zSystem.map for the highest number not
larger than that address. Ie,
0019a000 T _fix_pointers
0019c700 t _intr_scsi
0019d000 t _NCR53c7x0_intr
That tells you what function it's in. Recompile the source file which
defines that function with debugging enabled, or the whole kernel if
you prefer, by editing /usr/src/linux/Makefile and adding a "-g" to
the CFLAGS definition.
#
# standard CFLAGS
#
Ie,
CFLAGS = -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -pipe
becomes
CFLAGS = -g -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -pipe
Rebuild the kernel, incrementally or by doing a
make clean
make
Make the kernel bootable by creating an entry in your /etc/lilo.conf
for it
image = /usr/src/linux/zImage
label = experimental
and re-running LILO as root, or by creating a boot floppy
make zdisk
Reboot and record the new EIP for the error.
If you have script installed, you may want to start it, as it will log
your debugging session to the typescript file.
Now, run
gdb /usr/src/linux/tools/zSystem
and enter
info line *<your EIP>
Ie,
info line *0x19c905
To which GDB will respond something like
(gdb) info line *0x19c905
Line 2855 of "53c7,8xx.c" starts at address 0x19c905 <intr_scsi+641>
and ends at 0x19c913 <intr_scsi+655>.
Record this information. Then, enter
list <line number>
Ie,
(gdb) list 2855
2850 /* printk("scsi%d : target %d lun %d unexpected disconnect\n",
2851 host->host_no, cmd->cmd->target, cmd->cmd->lun); */
2852 printk("host : 0x%x\n", (unsigned) host);
2853 printk("host->host_no : %d\n", host->host_no);
2854 printk("cmd : 0x%x\n", (unsigned) cmd);
2855 printk("cmd->cmd : 0x%x\n", (unsigned) cmd->cmd);
2856 printk("cmd->cmd->target : %d\n", cmd->cmd->target);
2857 if (cmd) {;
2858 abnormal_finished(cmd, DID_ERROR << 16);
2859 }
2860 hostdata->dsp = hostdata->script + hostdata->E_schedule /
2861 sizeof(long);
2862 hostdata->dsp_changed = 1;
2863 /* SCSI PARITY error */
2864 }
2865
2866 if (sstat0_sist0 & SSTAT0_PAR) {
2867 fatal = 1;
2868 if (cmd && cmd->cmd) {
2869 printk("scsi%d : target %d lun %d parity error.\n",
Obviously, quit will take you out of GDB.
Record this information too, as it will provide a context in case the
developers' kernels differ from yours.
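The symbol lookup step described earlier in this section (finding the
highest address in the map file that is not larger than the EIP) can
be sketched in Python as follows. The function names are hypothetical,
and the sample map holds only the three lines quoted above rather than
a real map file.

```python
import bisect


def parse_system_map(text):
    """Return a sorted list of (address, name) from map-file lines."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            address, _type, name = fields
            symbols.append((int(address, 16), name))
    symbols.sort()
    return symbols


def find_symbol(symbols, eip):
    """Return (name, offset) for the highest symbol address not above eip."""
    addresses = [address for address, _name in symbols]
    index = bisect.bisect_right(addresses, eip) - 1
    if index < 0:
        raise LookupError("address is below the first symbol")
    address, name = symbols[index]
    return name, eip - address


SAMPLE_MAP = """\
0019a000 T _fix_pointers
0019c700 t _intr_scsi
0019d000 t _NCR53c7x0_intr
"""

if __name__ == "__main__":
    name, offset = find_symbol(parse_system_map(SAMPLE_MAP), 0x19C905)
    print("%s+0x%x" % (name, offset))
```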
4. Modules
This section gives specific details regarding the support for loadable
kernel modules and how it relates to SCSI.
4.1. General Information
Loadable modules are a means by which the user or system administrator
can load files into the kernel's memory in such a way that the
kernel's capabilities are expanded. The most common usages of modules
are for drivers to support hardware, or to load filesystems.
There are several advantages of modules for SCSI. One is that a
system administrator trying to maintain a large number of machines can
use a single kernel image for all of the machines, and then load
kernel modules to support hardware that is only present on some
machines.
It is also possible for someone trying to create a distribution to use
a script on the bootable floppy to query for which modules to be
loaded. This saves memory that would otherwise be wasted on unused
drivers, and it would also reduce the possibility that a probe for a
non-existent card would screw up some other card on the system.
Modules also work out nicely on laptops, which tend to have less
memory than desktop machines, and people tend to want to keep the
kernel image as small as possible and load modules as required. Also,
modules make supporting PCMCIA SCSI cards on laptops somewhat easier,
since you can load and unload the driver as the card is
inserted/removed. [Note: currently the qlogic and 152x drivers support
PCMCIA].
Finally, there is the advantage that kernel developers can more easily
debug and test their drivers, since testing a new driver does not
require rebooting the machine (provided of course that the machine has
not completely crashed as a result of some bug in the driver).
Although modules are very nice, there is one limitation. If your root
disk partition is on a scsi device, you will not be able to use
modularized versions of scsi code required to access the disk. This
is because the system must be able to mount the root partition before
it can load any modules from disk. There are people thinking about
ways of fixing the loader and the kernel so that the kernel can self-
load modules prior to attempting to mount the root filesystem, so at
some point in the future this limitation may be lifted.
4.2. Module support in the 1.2.N kernel
In the 1.2.N series of kernels, there is partial support for SCSI
kernel modules. While none of the high level drivers (such as disk,
tape, etc) can be used as modules, most of the low level drivers (i.e.
1542, 1522) can be loaded and unloaded as required. Each time you
load a low-level driver, the driver first searches for cards that can
be driven. Next, the bus is scanned for each card that is found, and
then the internal data structures are set up so as to make it possible
to actually use the devices attached to the cards that the driver is
managing.
When you are through with a low-level driver, you can unload it. You
should keep in mind that usage counts are maintained based upon
mounted filesystems, open files, etc, so that if you are still using a
device that the driver is managing, the rmmod utility will tell you
that the device is still busy and refuse to unload the driver. When
the driver is unloaded, all of the associated data structures are also
freed so that the system state should be back to where it was before
the module was loaded. This means that the driver could be reloaded
at a later time if required.
4.3. Module support in the 1.3.N kernel
In the 1.3 series of kernels, the scsi code is completely modularized.
This means that you can start with a kernel that has no scsi support
whatsoever, and start loading modules and you will eventually end up
with complete support.
If you wish, you can compile some parts of the SCSI code into the
kernel and then load other parts later - it is all up to you how much
gets loaded at runtime and how much is linked directly into the
kernel.
If you are starting with a kernel that has no support whatsoever for
SCSI, then the first thing you will need to do is to load the scsi
core into the kernel - this is in a module called "scsi_mod". You
will not be able to load any other scsi modules until you have this
loaded into kernel memory. Since this does not contain any low-level
drivers, the act of loading this module will not scan any busses, nor
will it activate any drivers for scsi disks, tapes, etc. If you
answered 'Y' to the CONFIG_SCSI question when you built your kernel,
you will not need to load this module.
At this point you can add modules in more or less any order to achieve
the desired functionality. Usage counts and interlocks are used to
prevent unloading of any component which might still be in use, and
you will get a message from rmmod if a module is still busy.
The high level drivers are in modules named "sd_mod", "sr_mod", "st",
and "sg", for disk, cdrom, tape, and scsi generics support
respectively. When you load a high level driver, the device list for
all attached hosts is examined for devices which the high level driver
can drive, and these are automatically activated.
The use of modules with low level drivers was described in the
section ``Module support in the 1.2.N kernel''. When a low-level driver
is loaded, the bus is scanned, and each device is examined by each of
the high level drivers to see if they recognize it as something that
they can drive - anything recognized is automatically attached and
activated.
5. Hosts
This section gives specific information about the various host
adapters that are supported in some way or another under linux.
5.1. Supported and Unsupported Hardware
Drivers in the distribution kernel :
Adaptec 152x, Adaptec 154x (DTC 329x boards usually work, but are
unsupported), Adaptec 174x, Adaptec 274x/284x (294x support requires a
newer version of the driver), BusLogic MultiMaster Host Adapters,
EATA-DMA and EATA-PIO protocol compliant boards (DPT PM2001, PM2011,
PM2012A, PM2012B, PM2021, PM2022, PM2024, PM2122, PM2124, PM2322,
PM2041, PM2042, PM2044, PM2142, PM2144, PM2322, PM3021, PM3122,
PM3222, PM3224, PM3334, and some boards from NEC, AT&T, SNI, AST,
Olivetti, and Alphatronix), Future Domain 850, 885, 950, and other
boards in that series (but not the 840, 841, 880, and 881 boards unless
you make the appropriate patch), Future Domain 16x0 with TMC-1800,
TMC-18C30, or TMC-18C50 chips, NCR53c8xx, PAS16 SCSI ports, Seagate
ST0x, Trantor
T128/T130/T228 boards, Ultrastor 14F, 24F, and 34F, and Western
Digital 7000.
MCA :
MCA boards which are compatible with a supported board (ie, Adaptec
1640 and BusLogic 640) will work.
Alpha drivers :
Many ALPHA drivers are available at
ftp://tsx-11.mit.edu/pub/linux/ALPHA/scsi
Drivers which will work with modifications
NCR53c8x0/7x0:
A NCR53c8xx driver has been developed, but currently will not work
with NCR53c700, NCR53c700-66, NCR53c710, and NCR53c720 chips. A list
of changes needed to make each of these chips work follows, as well
as a summary of the complexity.
NCR53c720 (trivial) - detection changes, initialization changes, change
fixup code to translate '810 register addresses to
'7xx mapping.
NCR53c710 (trivial) - detection changes, initialization changes,
of assembler, change fixup code to translate '810 register
addresses to '7xx mapping, change interrupt handlers to treat
IID interrupt from INTFLY instruction to emulate it.
NCR53c700, NCR53c700-66 (very messy) - detection changes,
initialization changes, modification of NCR code to not use DSA,
modification of Linux code to handle context switches.
SCSI hosts that will not work :
All parallel->SCSI adapters, Rancho SCSI boards, and Grass Roots SCSI
boards. BusLogic FlashPoint boards, such as the BT-930/932/950, are
currently unsupported.
SCSI hosts that will NEVER work :
Non Adaptec compatible, non NCR53c8xx DTC boards (including the 3270
and 3280).
CMD SCSI boards.
Acquiring programming information requires a non-disclosure agreement
with DTC/CMD. This means that it would be impossible to distribute a
Linux driver if one were written, since complying with the NDA would
mean distributing no source, in violation of the GPL, and complying
with the GPL would mean distributing source, in violation of the NDA.
If you want to run Linux on some other unsupported piece of hardware,
your options are to either write a driver yourself (Eric Youngdale and
I are usually willing to answer technical questions concerning the
Linux SCSI drivers) or to commission a driver (Normal consulting rates
mean that this will not be a viable option for personal use).
5.1.1. Multiple host adapters
With some host adapters (see ``Buyers' Guide : Feature Comparison''),
you can use multiple host adapters of the same type in the same
system. With multiple adapters of the same type in the same system,
generally the one at the lowest address will be scsi0, the one at the
next address scsi1, etc.
In all cases, it is possible to use multiple host adapters of
different types, provided that none of their addresses conflict. SCSI
controllers are scanned in the order specified in the
builtin_scsi_hosts[] array in drivers/scsi/hosts.c, with the order
currently being
BusLogic, Ultrastor 14/34F, Ultrastor 14F, Adaptec
151x/152x, Adaptec 154x, Adaptec 174x, AIC7XXX, AM53C974,
Future Domain 16x0, Always IN2000, Generic NCR5380, QLOGIC,
PAS16, Seagate, Trantor T128/T130, NCR53c8xx, EATA-DMA,
WD7000, debugging driver.
In most cases (ie, you aren't trying to use both BusLogic and Adaptec
drivers), this can be changed to suit your needs (ie, keeping the same
devices when new SCSI devices are added to the system on a new
controller) by moving the individual entries.
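The numbering rules in this section can be sketched in Python as
follows: drivers are visited in the probe order listed above, and
boards handled by the same driver are generally numbered from the
lowest address up. This is only an illustration under those two
assumptions; the function name and the sample addresses are
hypothetical, and the real order within one driver depends on that
driver's detection routine.

```python
# Probe order copied from the list above.
PROBE_ORDER = [
    "BusLogic", "Ultrastor 14/34F", "Ultrastor 14F", "Adaptec 151x/152x",
    "Adaptec 154x", "Adaptec 174x", "AIC7XXX", "AM53C974",
    "Future Domain 16x0", "Always IN2000", "Generic NCR5380", "QLOGIC",
    "PAS16", "Seagate", "Trantor T128/T130", "NCR53c8xx", "EATA-DMA",
    "WD7000", "debugging driver",
]


def number_hosts(adapters):
    """Map (driver, address) pairs to scsi0, scsi1, ... names."""
    # Sort by position in the probe order first, then by address.
    ordered = sorted(adapters, key=lambda a: (PROBE_ORDER.index(a[0]), a[1]))
    return {"scsi%d" % n: adapter for n, adapter in enumerate(ordered)}


if __name__ == "__main__":
    installed = [
        ("Adaptec 154x", 0x334),
        ("Seagate", 0xC8000),
        ("Adaptec 154x", 0x330),
    ]
    for host, (driver, address) in number_hosts(installed).items():
        print("%s : %s at 0x%x" % (host, driver, address))
```

Moving an entry within PROBE_ORDER has the same effect here as moving
the corresponding entry in the array in hosts.c.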
5.2. Common Problems
5.2.1. SCSI timeouts
Make sure interrupts are enabled correctly, and there are no IRQ, DMA,
or address conflicts with other boards.
5.2.2. Failure of autoprobe routines on boards that rely on BIOS for
autoprobe.
If your SCSI adapter is one of the following :
Adaptec 152x, Adaptec 151x, Adaptec AIC-6260, Adaptec
AIC-6360, Future Domain 1680, Future Domain TMC-950, Future
Domain TMC-8xx, Trantor T128, Trantor T128F, Trantor T228F,
Seagate ST01, Seagate ST02, or a Western Digital 7000
and it is not detected on bootup, ie you get a
scsi : 0 hosts
message or a
scsi%d : type
message is not printed for each supported SCSI adapter installed in
the system, you may have a problem with the autoprobe routine not
knowing about your board.
Autodetection will fail for drivers using the BIOS for autodetection
if the BIOS is disabled. Double check that your BIOS is enabled, and
not conflicting with any other peripheral BIOSes.
Autodetection will also fail if the board's "signature" and/or BIOS
address don't match known ones.
If the BIOS is installed, please use DOS and DEBUG to find a signature
that will detect your board -
Ie, if your board lives at 0xc8000, under DOS do
debug
d c800:0
q
and send a message to the SCSI channel of the mailing list with the
ASCII message, with the length and offset from the base address (ie,
0xc8000). Note that the EXACT text is required, and you should
provide both the hex and ASCII portions of the text.
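If you have saved the BIOS area to a file by some other means, a short
Python sketch like the one below can list candidate signatures along
with the offset and length asked for above. The function name and the
minimum length of 8 are arbitrary choices of mine, and the sample bytes
are made up; this does not replace sending the exact hex and ASCII
output from DEBUG.

```python
import re


def ascii_runs(dump, min_length=8):
    """Yield (offset, length, text) for printable ASCII runs in a dump."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    for match in pattern.finditer(dump):
        text = match.group().decode("ascii")
        # match.start() is the offset from the start of the dump, ie
        # from the base address the dump was taken at.
        yield match.start(), len(text), text


if __name__ == "__main__":
    # Made-up data standing in for the first bytes of a BIOS.
    sample = b"\x55\xaa\x10" + b"Example SCSI BIOS v1.0" + b"\x00\x00"
    for offset, length, text in ascii_runs(sample):
        print("offset %d, length %d : %s" % (offset, length, text))
```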
If no BIOS is installed, and you are using an Adaptec 152x, Trantor
T128, or Seagate driver, you can use command line or compile time
overrides to force detection.
Please consult the appropriate subsection for your SCSI board as well
as section ``General Flakiness''.
5.2.3. Failure of boards using memory mapped I/O
(This includes the Trantor T128 and Seagate boards, but not the
Adaptec, Generic NCR5380, PAS16, and Ultrastor drivers)
This is often caused when the memory mapped I/O ports are incorrectly
cached. You should have the board's address space marked as
uncachable in the XCMOS settings.
If this is not possible, you will have to disable cache entirely.
If you have manually specified the address of the board, remember that
Linux needs the actual address of the board, and not the 16 byte
segment the documentation may refer to.
Ie, 0xc8000 would be correct, 0xc800 would not work and could cause
memory corruption.
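The conversion between the two forms is a shift by four bits, since a
segment counts 16 byte units. A minimal Python sketch, with a
hypothetical function name:

```python
def segment_to_address(segment):
    """Convert a 16 byte segment value to the actual address."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment must fit in 16 bits")
    return segment << 4   # multiply by 16


if __name__ == "__main__":
    print(hex(segment_to_address(0xC800)))
```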
5.2.4. "kernel panic : cannot mount root device" when booting an ALPHA
driver boot floppy
You'll need to edit the binary image of the kernel (before or after
writing it out to disk), and modify a few two byte fields (little
endian) to guarantee that it will work on your system.
1. default swap device at offset 502, this should be set to 0x00 0x00
2. ram disk size at offset 504, this should be set to the size of the
boot floppy in K - ie, 5.25" = 1200, 3.5" = 1440.
This means the bytes are
3.5" : 0xA0 0x05
5.25" : 0xB0 0x04
3. root device at offset 508, this should be 0x00 0x00, ie the boot
device.
dd or rawrite the file to a disk. Insert the disk in the first floppy
drive, wait until it prompts you to insert the root disk, and insert
the root floppy from your distribution.
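The three edits listed above can be sketched in Python, assuming the
kernel image has been read into memory as bytes. The function and
constant names are hypothetical; the offsets and values are the ones
given in the list. The "<H" format packs an unsigned two byte value in
little endian order, so 1440 becomes 0xA0 0x05 and 1200 becomes 0xB0
0x04.

```python
import struct

SWAP_DEV_OFFSET = 502       # default swap device
RAMDISK_SIZE_OFFSET = 504   # ram disk size in K
ROOT_DEV_OFFSET = 508       # root device


def patch_boot_fields(image, floppy_kbytes):
    """Return a patched copy of a kernel image given as bytes."""
    patched = bytearray(image)
    if len(patched) < ROOT_DEV_OFFSET + 2:
        raise ValueError("image is too short to hold the fields")
    struct.pack_into("<H", patched, SWAP_DEV_OFFSET, 0)
    struct.pack_into("<H", patched, RAMDISK_SIZE_OFFSET, floppy_kbytes)
    struct.pack_into("<H", patched, ROOT_DEV_OFFSET, 0)
    return bytes(patched)


if __name__ == "__main__":
    # A dummy 512 byte block stands in for the start of a real image.
    patched = patch_boot_fields(b"\xff" * 512, 1440)
    print(patched[RAMDISK_SIZE_OFFSET:RAMDISK_SIZE_OFFSET + 2].hex())
```

Work on a copy of the image, and compare the patched bytes against the
values above before writing anything to disk.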
5.2.5. Installing a device driver not included with the distribution
kernel
You need to start with the version of the kernel used by the driver
author. This revision may be alluded to in the documentation included
with the driver.
Various recent kernel revisions can be found at
nic.funet.fi:/pub/OS/Linux/PEOPLE/Linus
as linux-version.tar.gz
They are also mirrored at tsx-11.mit.edu and various other sites.
cd to /usr/src.
Move your old Linux sources out of the way if you want to keep a
backup copy of them
mv linux linux-old
Untar the archive
gunzip < linux-0.99.12.tar.gz | tar xvfp -
Apply the patches. The patches will be relative to some directory in
the filesystem. By examining the output file lines in the patch file
(grep for ^---), you can tell where this is - ie patches with these
lines
--- ./kernel/blk_drv/scsi/Makefile
--- ./config.in Wed Sep 1 16:19:33 1993
would have the files relative to /usr/src/linux.
Untar the driver sources at an appropriate place - you can type
tar tfv patches.tar
to get a listing, and move files as necessary (The SCSI driver files
should live in /usr/src/linux/kernel/drivers/scsi)
Either cd to the directory they are relative to and type
patch -p0 < patch_file
or tell patch to strip off leading path components. Ie, if the files
started with
--- linux-new/kernel/blk_drv/scsi/Makefile
and you wanted to apply them while in /usr/src/linux, you could cd to
/usr/src/linux and type
patch -p1 < patches
to strip off the "linux-new" component.
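To make the effect of the -p option concrete, here is a simplified
Python sketch of how leading path components are stripped from the
file names in a patch. The function name is hypothetical, and the real
patch program handles more cases (for example, runs of adjacent
slashes), so treat this only as an illustration.

```python
def strip_components(path, count):
    """Drop `count` leading slash-separated components, like patch -p."""
    parts = path.split("/")
    if count >= len(parts):
        raise ValueError("cannot strip %d components from %r" % (count, path))
    return "/".join(parts[count:])


if __name__ == "__main__":
    # -p0 leaves the name alone, -p1 removes the "linux-new" component.
    print(strip_components("./config.in", 0))
    print(strip_components("linux-new/kernel/blk_drv/scsi/Makefile", 1))
```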
After you have applied the patches, look for any patch rejects, which
will be the name of the rejected file with a # suffix appended.
find /usr/src/linux/ -name "*#" -print
If any of these exist, look at them. In some cases, the differences
will be in RCS identifiers and will be harmless; in other cases,
you'll have to manually apply important parts. Documentation on diff
files and patch is beyond the scope of this document.
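Newer versions of patch write rejects to files ending in .rej rather than #, so it may be worth searching for both. A small sketch; list_rejects is a made-up helper name, and the directory in the usage example is simply the source tree location used throughout this section:

```shell
#!/bin/sh
# List reject files left behind by patch under the given directory.
# Matches both the "#" suffix described above and the ".rej" suffix
# written by newer versions of patch.
# list_rejects is a hypothetical helper name.
list_rejects() {
    find "$1" -type f \( -name '*#' -o -name '*.rej' \) -print
}

# Only search the kernel tree if it is actually present.
if [ -d /usr/src/linux ]; then
    list_rejects /usr/src/linux
fi
```

No output means no rejects were found under that directory.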
See also ``Configuring and building the kernel''.
5.2.6. Installing a driver that has no patches
In some cases, a driver author may not offer patches with the .c and
.h files which comprise his driver, or the patches may be against an
older revision of the kernel and not go in cleanly.
1. Copy the .c and .h files into /usr/src/linux/drivers/scsi
2. Add the configuration option
Edit /usr/src/linux/config.in, and in the
*
* SCSI low-level drivers
*
section, add a boolean configuration variable for your driver. Ie,
bool 'Always IN2000 SCSI support' CONFIG_SCSI_IN2000 y
3. Add the makefile entries
Edit /usr/src/linux/drivers/scsi/Makefile, and add an entry like
ifdef CONFIG_SCSI_IN2000
SCSI_OBJS := $(SCSI_OBJS) in2000.o
SCSI_SRCS := $(SCSI_SRCS) in2000.c
endif
before the
scsi.a: $(SCSI_OBJS)
line in the makefile, where the .c file is the .c file you copied in,
and the .o file is the basename of the .c file with a .o suffix.
4. Add the entry points
Edit /usr/src/linux/drivers/scsi/hosts.c, and add a #include for
the header file, conditional on the CONFIG_SCSI preprocessor define
you added to the configuration file. Ie, after
#ifdef CONFIG_SCSI_GENERIC_NCR5380
#include "g_NCR5380.h"
#endif
you might add
#ifdef CONFIG_SCSI_IN2000
#include "in2000.h"
#endif
You will also need to add the Scsi_Host_Template entry into the
scsi_hosts[] array. Take a look into the .h file, and you should find
a #define that looks something like this :
#define IN2000 {"Always IN2000", in2000_detect, \
in2000_info, in2000_command, \
in2000_queuecommand, \
in2000_abort, \
in2000_reset, \
NULL, \
in2000_biosparam, \
1, 7, IN2000_SG, 1, 0, 0}
Take the name of the preprocessor define, and add it into the scsi_hosts[]
array, conditional on definition of the preprocessor symbol you used
in the configuration file.
Ie, after
#ifdef CONFIG_SCSI_GENERIC_NCR5380
GENERIC_NCR5380,
#endif
you might add
#ifdef CONFIG_SCSI_IN2000
IN2000,
#endif
See also ``Configuring and building the kernel''.
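Since the steps above touch three different files, it is easy to forget one of them. The following sketch checks that the configuration symbol appears in each file; check_driver_hooks is a made-up helper name, and it only looks for the symbol, not for whether the surrounding edit is correct.

```shell
#!/bin/sh
# Report whether a configuration symbol is mentioned in each of the
# files edited in steps 2 to 4. Returns non-zero if any file lacks it.
# check_driver_hooks is a hypothetical helper name.
check_driver_hooks() {
    tree=$1      # top of the kernel source tree
    symbol=$2    # e.g. CONFIG_SCSI_IN2000
    missing=0
    for file in config.in drivers/scsi/Makefile drivers/scsi/hosts.c; do
        if grep -q "$symbol" "$tree/$file" 2>/dev/null; then
            echo "ok:      $file"
        else
            echo "missing: $file"
            missing=1
        fi
    done
    return $missing
}

# Only run against the kernel tree if it is actually present.
if [ -d /usr/src/linux ]; then
    check_driver_hooks /usr/src/linux CONFIG_SCSI_IN2000 ||
        echo "some edits are still missing"
fi
```

A file reported as missing either has not been edited yet or uses a different spelling of the symbol than the one in config.in.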
5.2.7. Failure of a PCI board in a Compaq System
A number of Compaq systems map the 32-bit BIOS extensions used to
probe for PCI devices into memory which is inaccessible to the Linux
kernel due to the memory layout. If Linux is unable to detect a
supported PCI SCSI board, and the kernel tells you something like
pcibios_init: entry in high memory, unable to access
grab
ftp://ftp.compaq.com/pub/softpaq/Software-Solutions/sp0921.zip
which is a self-extracting archive of a program which will relocate
the BIOS32 code.
5.2.8. A SCSI system with PCI boards hangs after the %d Hosts message
Some PCI systems have broken BIOSes which disable interrupts and fail
to reenable them before returning control to the caller. The
following patch fixes this :
--- bios32.c.orig Mon Nov 13 22:35:31 1995
+++ bios32.c Thu Jan 18 00:15:09 1996
@@ -56,6 +56,7 @@
#include <linux/pci.h>
#include <asm/segment.h>
+#include <asm/system.h>
#define PCIBIOS_PCI_FUNCTION_ID 0xb1XX
#define PCIBIOS_PCI_BIOS_PRESENT 0xb101
@@ -125,7 +126,9 @@
unsigned long address; /* %ebx */
unsigned long length; /* %ecx */
unsigned long entry; /* %edx */
+ unsigned long flags;
+ save_flags(flags);
__asm__("lcall (%%edi)"
: "=a" (return_code),
"=b" (address),
@@ -134,6 +137,7 @@
: "0" (service),
"1" (0),
"D" (&bios32_indirect));
+ restore_flags(flags);
switch (return_code) {
case 0:
@@ -161,11 +165,13 @@
unsigned char present_status;
unsigned char major_revision;
unsigned char minor_revision;
+ unsigned long flags;
int pack;
if ((pcibios_entry = bios32_service(PCI_SERVICE))) {
pci_indirect.address = pcibios_entry;
+ save_flags(flags);
__asm__("lcall (%%edi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -176,6 +182,7 @@
: "1" (PCIBIOS_PCI_BIOS_PRESENT),
"D" (&pci_indirect)
: "bx", "cx");
+ restore_flags(flags);
present_status = (pack >> 16) & 0xff;
major_revision = (pack >> 8) & 0xff;
@@ -210,7 +217,9 @@
{
unsigned long bx;
unsigned long ret;
+ unsigned long flags;
+ save_flags(flags);
__asm__ ("lcall (%%edi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -221,6 +230,7 @@
"c" (class_code),
"S" ((int) index),
"D" (&pci_indirect));
+ restore_flags(flags);
*bus = (bx >> 8) & 0xff;
*device_fn = bx & 0xff;
return (int) (ret & 0xff00) >> 8;
@@ -232,7 +242,9 @@
{
unsigned short bx;
unsigned short ret;
+ unsigned long flags;
+ save_flags(flags);
__asm__("lcall (%%edi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -244,6 +256,7 @@
"d" (vendor),
"S" ((int) index),
"D" (&pci_indirect));
+ restore_flags(flags);
*bus = (bx >> 8) & 0xff;
*device_fn = bx & 0xff;
return (int) (ret & 0xff00) >> 8;
@@ -254,7 +267,9 @@
{
unsigned long ret;
unsigned long bx = (bus << 8) | device_fn;
+ unsigned long flags;
+ save_flags (flags);
__asm__("lcall (%%esi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -273,7 +288,9 @@
{
unsigned long ret;
unsigned long bx = (bus << 8) | device_fn;
+ unsigned long flags;
+ save_flags(flags);
__asm__("lcall (%%esi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -292,7 +309,9 @@
{
unsigned long ret;
unsigned long bx = (bus << 8) | device_fn;
+ unsigned long flags;
+ save_flags(flags);
__asm__("lcall (%%esi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -303,6 +322,7 @@
"b" (bx),
"D" ((long) where),
"S" (&pci_indirect));
+ restore_flags(flags);
return (int) (ret & 0xff00) >> 8;
}
@@ -311,7 +331,9 @@
{
unsigned long ret;
unsigned long bx = (bus << 8) | device_fn;
+ unsigned long flags;
+ save_flags(flags);
__asm__("lcall (%%esi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -322,6 +344,7 @@
"b" (bx),
"D" ((long) where),
"S" (&pci_indirect));
+ restore_flags(flags);
return (int) (ret & 0xff00) >> 8;
}
@@ -330,7 +353,9 @@
{
unsigned long ret;
unsigned long bx = (bus << 8) | device_fn;
+ unsigned long flags;
+ save_flags(flags);
__asm__("lcall (%%esi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -341,6 +366,7 @@
"b" (bx),
"D" ((long) where),
"S" (&pci_indirect));
+ restore_flags(flags);
return (int) (ret & 0xff00) >> 8;
}
@@ -349,7 +375,9 @@
{
unsigned long ret;
unsigned long bx = (bus << 8) | device_fn;
+ unsigned long flags;
+ save_flags(flags);
__asm__("lcall (%%esi)\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
@@ -360,6 +388,7 @@
"b" (bx),
"D" ((long) where),
"S" (&pci_indirect));
+ restore_flags(flags);
return (int) (ret & 0xff00) >> 8;
}
5.3. Adaptec 152x, 151x, 1505, 282x, Sound Blaster 16 SCSI, SCSI Pro,
Gigabyte, and other AIC 6260/6360 based products (Standard)
Supported Configurations :
BIOS addresses : 0xd8000, 0xdc000, 0xd0000, 0xd4000, 0xc8000, 0xcc000, 0xe0000,
0xe4000.
Ports : 0x140, 0x340
IRQs : 9, 10, 11, 12
DMA : not used
IO : port mapped
Autoprobe :
Works with many boards with an installed BIOS. All
other boards, including the Adaptec 1510 and Sound Blaster 16 SCSI,
must use a kernel command line or compile time override.
Autoprobe Override :
Compile time :
Define PORTBASE, IRQ, SCSI_ID, RECONNECT, PARITY as appropriate, see Defines
kernel command line :
aha152x=<PORTBASE>[,<IRQ>[,<SCSI-ID>[,<RECONNECT>[,<PARITY>]]]]
SCSI-ID is the SCSI ID of the HOST adapter, not of any devices you
have installed on it. Usually, this should be 7.
To force detection at 0x340, IRQ 11, at SCSI-ID 7, allowing
disconnect/reconnect, you would use the following command line option :
aha152x=0x340,11,7,1
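Because the fields are positional, it can help to spell out which value lands in which field before rebooting. The sketch below only labels the values in the order given by the syntax line above; describe_aha152x is a made-up helper name and does no range checking, and it is not how the kernel itself parses the option.

```shell
#!/bin/sh
# Label the comma-separated values of an aha152x= option with the field
# names from the syntax line above.
# describe_aha152x is a hypothetical helper name.
describe_aha152x() {
    opt=$1
    case $opt in
        aha152x=*) values=${opt#aha152x=} ;;
        *) echo "not an aha152x option: $opt" >&2; return 1 ;;
    esac
    # Field names, in the order the option expects them.
    set -- PORTBASE IRQ SCSI-ID RECONNECT PARITY
    old_ifs=$IFS
    IFS=,
    out=
    for value in $values; do
        if [ $# -eq 0 ]; then
            IFS=$old_ifs
            echo "too many fields in: $opt" >&2
            return 1
        fi
        out="${out}${out:+ }$1=$value"
        shift
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}

describe_aha152x aha152x=0x340,11,7,1
# prints: PORTBASE=0x340 IRQ=11 SCSI-ID=7 RECONNECT=1
```

Trailing fields may be left off, but a field in the middle cannot be skipped: to set RECONNECT you must also give IRQ and SCSI-ID.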
Antiquity Problems, fix by upgrading :
1. The driver fails with VLB boards. There was a timing problem in
kernels older than revision 1.0.5.
Defines :
AUTOCONF : use configuration the controller reports (only 152x)
IRQ : override interrupt channel (9,10,11 or 12) (default 11)
SCSI_ID : override SCSI ID of AIC-6260 (0-7) (default 7)
RECONNECT : override target disconnect/reselect (set to non-zero to
allow, zero to disable)
DONT_SNARF : Don't register ports (pl12 and below)
SKIP_BIOSTEST : Don't test for BIOS signature (AHA-1510 or disabled BIOS)
PORTBASE : Force port base. Don't try to probe
5.4. Adaptec 154x, AMI FastDisk VLB, DTC 329x (Standard)
Supported Configurations :
Ports : 0x330 and 0x334
IRQs : 9, 10, 11, 12, 14, 15
DMA channels : 5, 6, 7
IO : port mapped, bus master
Autoprobe :
will detect boards at 0x330 and 0x334 only.
Autoprobe override :
aha1542=<PORTBASE>[,<BUSON>,<BUSOFF>[,<DMASPEED>]]
Notes:
1. BusLogic makes a series of boards that are software compatible with
the Adaptec 1542, and these come in ISA, VLB, EISA, and PCI
flavors.
2. No-suffix boards, and early 'A' suffix boards do not support
scatter/gather, and thus don't work. However, they can be made to
work for some definition of the word works if AHA1542_SCATTER is
changed to 0 in drivers/scsi/aha1542.h.
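If you would rather script the header change than make it in an editor, something along these lines may work. It is a sketch under assumptions: set_define is a made-up helper, and it assumes the header contains a line of the form "#define NAME value" starting in the first column; check the file in your own tree first. The demonstration runs on a scratch file, not on the real header, and the value 16 in it is only sample data.

```shell
#!/bin/sh
# Rewrite the value of a "#define NAME value" line in a file.
# set_define is a hypothetical helper name; anything after NAME on that
# line (including a trailing comment) is replaced by the new value.
set_define() {
    file=$1
    name=$2
    value=$3
    tmp=$(mktemp) || return 1
    sed "s/^\(#define[[:space:]][[:space:]]*$name\)[[:space:]].*/\1 $value/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Demonstration on a scratch file with an assumed sample line.
demo=$(mktemp) || exit 1
printf '#define AHA1542_SCATTER 16\n' > "$demo"
set_define "$demo" AHA1542_SCATTER 0
cat "$demo"     # prints: #define AHA1542_SCATTER 0
rm -f "$demo"
```

Keep a copy of the original header so the change can be undone, and rebuild the kernel afterwards.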
Antiquity Problems, fix by upgrading :
1. Linux kernel revisions prior to .99.10 don't support the 'C'
revision.
2. Linux kernel revisions prior to .99.14k don't support the 'C'
revision options for
· BIOS support for the extended mapping for disks > 1G
· BIOS support for > 2 drives
· BIOS support for autoscanning the SCSI bus
3. Linux kernel revisions prior to .99.15e don't support the 'C' with
the BIOS support for > 2 drives turned on and the BIOS support for
the extended mapping for disks > 1G turned off.
4. Linux kernel revisions prior to .99.14u don't support the 'CF'
revisions of the board.
5. Linux kernel revisions prior to 1.0.5 have a race condition when
multiple devices are accessed at the same time.
Common problems :
1. There are unexpected errors with a 154xC or 154xCF board.
Early examples of the 154xC boards have a high slew rate on one of
the SCSI signals, which results in signal reflections when cables
with the wrong impedance are used.
Newer boards aren't much better, and also suffer from extreme
cabling and termination sensitivity.
See also Common Problems ``#2'' and ``#3'' and ``Common Problems'',
``General Flakiness''.
2. There are unexpected errors with a 154xC or 154x with both internal
and external devices connected.
This is probably a termination problem. In order to use the
software option to disable host adapter termination, you must turn
switch 1 off.
See also Common Problems ``#1'' and ``#3'' and ``Common Problems'',
``General Flakiness''.
3. The SCSI subsystem locks up completely.
There are cases where the lockups seem to occur when multiple
devices are in use at the same time. In this case, you can try
contacting the manufacturer of the devices and see if firmware
upgrades are available which would correct the problem. As a last
resort, you can go into aha1542.h and change AHA1542_MAILBOX to 1.
This will effectively limit you to one outstanding command on the
scsi bus at one time, and may help the situation. If you have tape
drives or slow cdrom drives on the bus, this might not be a
practical solution.
See also Common Problems ``#1'' and ``#2'' and ``Common Problems'',
``Common Problems : SCSI System Lockups''.
4. An "Interrupt received, but no mail" message is printed on bootup
and your SCSI devices are not detected.
Disable the BIOS options to support the extended mapping for disks
> 1G, support for > 2 drives, and for autoscanning the bus. Or,
upgrade to Linux .99.14k or newer.
5. If infinite timeout errors occur on 'C' revision boards, you may
need to go into the Adaptec setup program and enable synchronous
negotiation.
6. Linux 1.2.x gives the message
Unable to determine Adaptec DMA priority. Disabling board.
This is due to a conflict on some systems with the obsolete
BusLogic driver. Either rebuild your kernel without it, or give
the BusLogic driver a command line option telling it to look
somewhere other than where your controller is configured. Ie, if
you have an Adaptec board at port 0x334, and nothing at 0x330, use
a command line option like
buslogic=0x330
7. The system locks up with simultaneous access to multiple devices on
a 1542C or 1540C with disconnection enabled.
Some Adaptec firmware revisions have bugs. Upgrading to BIOS