                   PCI EIDE CONTROLLER FLAWS REV 20
                                   
                                   
revised 1996 November 1 by Roedy Green


Summary of Recent Changes

1)   The RZ-1000 and CMD-640 flaws have still not been fixed in new
versions of the chips.

2)   Windows 95 may no longer be immune to the flaws.  There are a
couple of unconfirmed reports of failure after the Service Pack was
applied.

3)   Intel's CtrlTest to check for both the RZ-1000 and CMD-640 chips
is now available under the name RZtest.exe. Beware! Old versions of
the MS Word documentation contain a macro virus. The virus was
removed in October 95. See
http://www.intel.com/procs/support/rz1000/index.html

4)   There is a mysterious new patch for 640B for Warp called
640x_v20.zip dated Sept 1, 1995. Its source is unknown; however, it
appears to work, and to work faster than Fixpack 10. It is probably the
code the CMD company wrote. I have had one report that it was
available via FTP at Phen.techhouse.brown.edu, but I have never been
able to get through myself. I have another report it was not there.
CMD has an OEM BBS for its customers, but it is not open to the
public. I could not find it on the public BBS at (714) 454-1134;
however, I did find 640X_USR.403, which contains a variety of patches
for various operating systems.

5)   Art Scott (scotta@pilot.msu.edu) suggests that you can sometimes
tweak the performance of the RZ-1000 back up by changing, from 66 to
33, the advanced BIOS setting for the maximum number of cycles that a
PCI device can hold onto the PCI bus before the next board gets a
turn.

6)   According to John Blenkinsop (jblenkin@ccs.carleton.ca)
WFWIN10.ZIP is now available to update the install diskettes to the
Fixpack 10 level. It includes the RZ-1000 and CMD-640 fixes, but does
not automatically install the CMD fixes. See
ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-us/wfwin10/wfwin10.zip

7)   IBM heard about the RZ-1000 flaw back in June 1994, but dismissed
it as a "hardware error".

8)   According to lovergin@ens.lifl.fr, one retailer, La Cle
Informatique, in France is offering to replace the defective Vobis
motherboards it sold.

9)   EIDEtest 1.9 and CDTest 1.1 released. The only change is a
warning to run your tests with background execution configured on.

10)  Fixpack 10 contains the necessary fixes for Warp. Beware! There
are leaked, buggy copies of Fixpack 10 out on the net.

11)  PJ19409.zip has been changed. It now contains all the fixes
necessary for the RZ-1000 and for the CMD-640. Follow the installation
instructions carefully. If you just follow your nose, chances are you
will be worse off than you are now. This fix has been incorporated
into Fixpack 10.

12)  Intel contradicts itself on the performance hit from disabling
prefetch to bypass the flaws. Robert Schultz
(robert.schultz@execnet.com) reports a 50% performance hit after
applying the CMD-640 fix. Marco Trunzer (ujjm@rzstud1.rz.uni-
karlsruhe.de) reports a 15% slowdown. There are still no benchmarks on
the effects on background bus-intensive processes.

13)  Dell is upgrading its XPS 90 to avoid the flawed chips, but they
are keeping the old kiss of death name.

14)  Micron P5-90 M54Pi-N 11P has a flawed CMD-640 chip on the primary
channel, but a working SMC chip on the secondary channel. By moving
your EIDE devices to the secondary channel, you can avoid the flawed
chip.

15)  The precise mechanism of failure for both the RZ-1000 and CMD-640
is now understood. The RZ-1000 has two different flaws and the CMD-640
has five. In addition most motherboard manufacturers using these two
chips hooked them up improperly.

16)  SMC 37650 controller is probably ok.

17)  NT 3.5 not immune after all. It handles the RZ-1000 but not the
CMD PCIO 640. Fix is available.

18)  Software from IBM and Intel to detect both faulty chips directly.

19)  Explanation of what "Intel Inside" means.

20)  Dell offers upgrade BIOS to turn off the prefetch buffers.

21)  List of safe and unsafe operating system software.

22)  IBM hardware is clean.

23)  Stonewall rebuilds. Intel recants on offer to replace defective
motherboard.

24)  Problem is showing up under Windows For WorkGroups in 32-bit
mode.


Introduction

There are serious flaws affecting about 1/3 of all PCI motherboards.
The flaws affect any motherboard or EIDE controller paddleboard
containing the PC-Tech RZ-1000 PCI EIDE controller chip or the CMD
PCIO 640 PCI EIDE controller chip.

The flaws affect motherboards from ASUSTeK, AT&T, DEC, Dell, Gateway,
Intel, Micron, NEC, Zeos and others. Since Intel makes so many of the
motherboards sold under other brand names, the flaws affect many
machines, both 486 and Pentium PCI.

The flaws show up most frequently when you run a true multitasking
operating system such as OS/2 Warp or NT. They also show up under
Windows For WorkGroups in 32-bit mode during tape or floppy backup and
restore. In theory the flaws could do damage under DOS, DESQview,
Windows and Windows For WorkGroups in 16-bit mode, but so far there
have been no damage reports. Windows-95 contains code to bypass the
flaws.

The RZ-1000 has two flaws. The CMD-640 has those same two flaws plus
three others. To make matters worse, most motherboard manufacturers
using these two flawed chips connected them up incorrectly. There are
software bypasses for these flaws. However, the Warp fix for the
CMD-640 reduces disk performance by 15 to 50%. The RZ-1000 fix has
negligible impact on disk I/O, though it can slow down background
processes.

I would advise buying new hardware to bypass the CMD-640 flaws, and
living with software fixes to bypass the RZ-1000 flaws.


What are the symptoms?

When you are using an IDE or EIDE hard disk attached to the EIDE
motherboard port, the flaws subtly corrupt your files by randomly
changing bytes every once in a while. The flaws introduce bugs into
EXE files, subtle errors into your spreadsheets, stray characters into
your word processing documents, changes to the deductions in last
year's tax return files, and random changes to engineering design
files.

This corruption happens when you are simultaneously using your EIDE or
IDE hard disk and some other device, most commonly the floppy drive or
mag tape backup.

The same sorts of problem may occur on reading a CD-ROM drive attached
to an EIDE port.


Is it Serious?

These flaws are nasty. They are causing hundreds of times more havoc
than the infamous Pentium divide flaw ever did. "I am Pentium of Borg.
You will be approximated."

Not only does this corruption occur, but it occurs quietly, often
going unnoticed.

If the system crashes, you usually put the blame on the operating
system software, or the application. It might actually be a faulty RZ-
1000 or CMD-640 EIDE controller chip nailing you.

When a directory becomes corrupted, you may not notice it until the
damage is irreparable. If a spreadsheet application reads a comma-
delimited ASCII file, it may simply miss a few bytes in a number, an
error that may go unnoticed, and that error could cascade through the
rest of the spreadsheet.

If you have had unexplained crashes in OS/2, you have probably
experienced the problem, and should make a thorough check for hidden
corruption. Remember that the bug may only slightly alter your data,
and the corruption may not be obvious.

Keep in mind that not every problem is the RZ-1000's or the CMD-640's
fault. Overheating, unrelated hardware faults and design flaws, or
software bugs can cause similar symptoms. DMA channel conflicts also
cause similar symptoms. Happily, EIDEtest and CDTest can unmask all
manner of simultaneous I/O faults.

Unfortunately, correcting the problem just stops further file
corruption. It will not help to clean up the existing damage to your
files. Right now, the focus is on bypassing the flaws. Preventing
further corruption is child's play compared with the nightmare of
trying to track down all the existing random errors in files. Backups
even from day one may be corrupt. If you have either of the flawed
chips, you will probably never be able to completely eliminate the
effects of past corruption.


How Do You Tell If You Have The Flawed Chips?

There are four categories of motherboard:

1)   Definitely safe. Motherboards may still have flaws, but all
software in use bypasses them.

2)   Probably safe. In theory there could be problems, but no one has
reported any so far.

3)   Possibly dangerous. You will have to run EIDEtest, CDtest, or
IOTest to find out.

4)   Probably dangerous. You will still have to run the tests to find
out for sure.

Definitely Safe

Definitely safe includes older machines with ISA, EISA, or MCA buses.
The flaws only affect machines with the new PCI bus or the VESA VL
bus. PCI machines that use the new Triton chipset from Intel do not
have the flaws.

PCI machines with Intel BIOSes that run only DOS, DESQview, Windows
3.1 or Windows-95 are safe. If you have a non-Intel BIOS and run only
DOS, DESQview, Windows 3.1 or Windows-95, and never use the "fast mode"
simultaneous disk I/O feature on floppy or tape backup/restore, you
are safe.

You still might want to test your machine. There are similar problems
with other causes that the tests will unmask.


Probably Safe

If you have a non-Intel BIOS and run only DOS, DESQview, Windows 3.1,
or Windows for WorkGroups 3.11 in 16-bit disk access mode, you
probably will not see the problem, even though you may have one of the
faulty chips.


Possibly Dangerous

Most auxiliary chipsets (e.g., OPTI Viper, SMC, Mercury and Neptune)
used on PCI motherboards do not include a built in EIDE controller.
Such motherboards use a separate EIDE controller chip -- often the
flawed RZ-1000 or CMD-640. If you use a separate no-name EIDE
paddleboard, it will likely use one of the flawed chips. In
theory, the flaws could affect DOS, Windows, and Windows For
WorkGroups with 16-bit disk access during floppy/tape backup and
restore, though no one has reported problems yet. Windows For
WorkGroups with 32-bit disk access is dangerous if you have the flaws.


Probably Dangerous

PCI Motherboards (both 486 and Pentium) with the older Mercury and
Neptune chipsets are likely to have the flawed chips. The Mercury
chipset was popular in P60 and P66 systems, and the Neptune in P70,
P90 and P100 systems. Mercury chipsets are labelled with an MX suffix
and Neptune with NX. If you are using NT, OS/2 Warp or Linux, you are
likely to have already experienced extensive file corruption if either
of the flawed chips is present. Check the list later in the article
for motherboards known to carry the flawed chips.


Testing For The Flaws

Scot Llewelyn, one of the eight authors of PowerQuest's
PartitionMagic, discovered one of the RZ-1000 flaws and made it
public. Prior to that, only employees of PC-Tech, Intel and Microsoft
were aware of how to bypass the flaws. In the process of tracking the
RZ-1000 problems down, Internet comp.os.os2.bugs participants
discovered a second flawed chip, the CMD-640.

Scot did most of the initial work documenting the first RZ-1000 flaw.
He wrote a program called IOtest that can detect the flaws if:

1)   You are using OS/2 Warp.

2)   You are willing to go through the hassle of creating a separate
small partition to run the test. You can use his program,
PartitionMagic, to make room to create one.

3)   You have an EIDE hard disk attached to your EIDE port. It cannot
detect the problem if you only have an EIDE CD-ROM, or if the EIDE
port is currently unused.

Scot originally called his test program DMAtest because he erroneously
thought simultaneous DMA was the sole culprit. Do not confuse
PowerQuest DMAtest with Gazelle's DMAtest, which only tests whether the
floppy drive will work happily simultaneously with the hard disk.

The world needed an easier-to-use test that would run under DESQview,
Windows, Windows For WorkGroups, Windows 95, NT and OS/2. So I wrote
EIDEtest to test for the flaws without requiring you to create a
special partition or buy OS/2 Warp. I also wrote CDTest to test for
the flaws when you have an EIDE CD-ROM drive.

You can also get both programs from me by snail mail.

If these tests fail, it proves you have a serious problem, but not
necessarily that you have the RZ-1000 or CMD-640 chip.

If the tests pass, you still may have a problem since, especially
under DOS, DESQview and Windows, the flaws may only show up very
rarely. If you run the tests under Windows-95 they will always pass,
even if you have a defective chip, because the operating system
already bypasses the flaws. If you suspect trouble, run the tests
several times.
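
The general idea behind a read-back test can be sketched in a few lines
of Python. This is only an illustration of the principle, not the
method EIDEtest, CDTest or IOtest actually use; the function names are
my own invention. It writes a file of random bytes, remembers a
checksum of what was written, then re-reads the file several times and
counts the passes that come back different. To provoke the flaws you
would also need other I/O (floppy or tape) running at the same time,
and a file large enough that the operating system cannot satisfy the
re-reads from its cache.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_test_file(path, size_bytes, chunk_size=65536):
    """Fill a file with random bytes and return the digest of what was written."""
    digest = hashlib.sha256()
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = os.urandom(min(chunk_size, remaining))
            f.write(chunk)
            digest.update(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the disk
    return digest.hexdigest()


def count_read_mismatches(path, expected_digest, passes=5):
    """Re-read the file several times and count passes whose digest differs."""
    return sum(1 for _ in range(passes) if file_digest(path) != expected_digest)
```

Any non-zero count means the data read back did not match the data
written. As with the real tests, a zero count proves nothing by itself.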


Visual Inspection

You can also have a look at your motherboard. Between the PCI slots,
at the edge of the motherboard, look for a rectangular chip about 1 by
2 cm (0.5" x 0.75") that says RZ-1000 near the top of the chip. There
are variations on the chip name, e.g., "RZ-1000BP". Unfortunately, the
markings are not always present, especially in ASUSTeK motherboards
which may have the "CMD PCIO 640A" or "CMD PCIO 640B" chip. As of
October 1995, all versions of the RZ-1000 and CMD-640 are defective,
even new ones.


Direct Tests

The OS/2 Warp Bonus Pack Sysinfo version 3.02 utility (the upgraded
downloaded version) will report on your EIDE controller. The signature
for the RZ-1000 looks like this:

manufacturer: PC TECHNOLOGY INC

class code : 0001

Vendor ID: 1042

Device ID: 1000

Revision ID: 0001



For the CMD-640B it will look like this:

manufacturer : CMD TECHNOLOGY INC

class code : 0001

Vendor ID :1095

Device ID : 0640

Revision ID : 0002
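
If you want to check such a report mechanically, the two signatures
above reduce to a lookup on the Vendor ID and Device ID pair. Here is a
minimal sketch in Python. It assumes the IDs are printed in
hexadecimal, which is the usual PCI convention; the function names are
hypothetical and are not part of Sysinfo or any other tool mentioned
here.

```python
# Vendor/device ID pairs taken from the two Sysinfo signatures quoted above.
# Assumption: the values are hexadecimal.
FLAWED_CONTROLLERS = {
    (0x1042, 0x1000): "RZ-1000",
    (0x1095, 0x0640): "CMD-640",
}


def parse_hex_field(line):
    """Parse a line such as 'Vendor ID: 1042' and return the value as an integer."""
    _, _, value = line.partition(":")
    return int(value.strip(), 16)


def identify_controller(vendor_id, device_id):
    """Return the name of the flawed chip, or None if the ID pair is not listed."""
    return FLAWED_CONTROLLERS.get((vendor_id, device_id))
```

For example, feeding it the lines "Vendor ID: 1042" and "Device ID:
1000" identifies the RZ-1000. A result of None only means the pair is
not one of the two listed here.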



The Warp disk driver IBM1S506.ADD with the /V switch will tell you if
you have the RZ-1000 or CMD-640 chip.

Intel has written a new test, CtrlTest.exe, that looks directly for
either of the two faulty chips. However, it is filed under its old
name, RZTest.exe.

The Windows-95 Control panel will also report on the EIDE controller
chip.


Where Have Flaws Been Found?

Via email, on BIX and on the Internet and in comp.os.os2.bugs, people
have reported finding flaws in the following specific motherboards.

Motherboard          Chip     Reporters
                              
Acculogic VL         CMD-640  Mark Lord (mlord@bnr.ca)
Paddleboard                   tentative
                              
Acer Power P75       CMD-640  John Harvey, Beta Machinery
                              Calgary
                              
ACMA P590            ?        Bob Smith
                              
AST Bravo MS-T P/75  CMD-640  Mike Coplien
                              (kcoplien@facstaff.wisc.edu)
                              
ASUSTeK PCI/I        CMD-640  Marco Trunzer
P54SP4                        (ujjm@rzstud1.rz.uni-
                              karlsruhe.de)
                              
                              Maurice Schekkerman
                              (schekker@prl.philips.nl)
                              
                              Mike Coplien
                              (kcoplien@facstaff.wisc.edu)
                              
                              Robert Schultz
                              (robert.schultz@execnet.com)
                              
                              Thomas L. Kusterer
                              (kustetl1@aplcomm.jhuapl.edu)
                              
AT&T Globalyst 590   RZ-1000  Brian Myrick
                              (brian@jagonet.com)
                              
AT&T Globalyst 600   RZ-1000  Brian Myrick
                              (brian@jagonet.com)
                              
AT&T Globalyst 630   CMD-640  Mike Coplien
                              (kcoplien@facstaff.wisc.edu)
                              
CMD CSA-62101Kx VL2  CMD-     George Voros
IDE paddleboard      640B     (george.voros@ghbbs.com)
                              
Compaq Presario      CMD-640  Walter Wu
                              (wu000016@mc.duke.edu)
                              
Compaq Prolinea      CMD-640  Walter Wu
                              (wu000016@mc.duke.edu)
                              
DEC Celebris 590     CMD-640  Fred Thomsen
                              (fthomsen@lexis.pop.upenn.edu)
                              
DEC Starion 700I     CMD-640  Mike Coplien
                              (kcoplien@facstaff.wisc.edu)
                              
DEC Venturis 466     CMD-640  Mike Coplien
                              (kcoplien@facstaff.wisc.edu)
                              
DEC Venturis 560     CMD-640  Fred Thomsen
                              (fthomsen@lexis.pop.upenn.edu)
                              
Dell Dimension XPS   RZ-1000  Scot Llewelyn
P100                          (scotl@itsnet.com)
                              
Dell Dimension XPS   RZ-1000  Steve Ertman
P75                           (sertman@ocean.fsu.edu)
                              
Dell Dimension XPS   RZ-1000  Dong Chen (D_Chen@netcom.com)
P90                           
                              Larry Lai (lai@iastate.edu)
                              
                              Lawrence Rounds
                              (ljrounds@netcom.com)
                              
                              Mike Griggs (mpg@iadfw.net)
                              
                              Mike Heath
                              (heath@rohan.sdsu.edu)
                              
                              Moira Watson
                              (watson6@uwindsor.ca)
                              
                              Nathaniel Beck @weber.ucsd.edu
                              
                              Pete (pag@interramp.com)
                              
                              Shallenberg
                              (bobshall@subtone.wanet.com)
                              
                              Wijadi Jodi
                              (r2nw@dax.cc.uakron.edu)
                              
Dell Optiplex 575    CMD-640  Mike Coplien
                              (kcoplien@facstaff.wisc.edu)
                              
Dell Optiplex XM     CMD-640  Aron Eisenpress
590                           (afecu@cunyvm.cuny.edu)
                              
Dell XPS-133c        neither  Blake Scholl (bscholl@one.net)
                              
EliteGroup S154P-    CMD-640  Ulf Volz (volz@student.uni-
AIO                           kl.de)
                              
EliteGroup UM8810P-  CMD-640  Bodo Huckestein (bh@thp.Uni-
AIO                           Koeln.DE)
                              
                              Guy Kapteijns
                              (W.Kapteijns@kub.nl)
                              
Escom P5/60          CMD-640  Detlef Meier
(Intel Premiere               (detlef.meier@materna.de)
ATLX)                         Rogier van Wanroij
                              (wanroij@cs.utwente.nl)
                              
Escom P60I           CMD-640  Tim Schofield
                              (schofieldt@logica.com)
                              
Escom P90            RZ-1000  Karl Knoflach
                              (151579kk@student.eur.nl )
                              
                              (Xav@mantra01.demon.co.uk)
                              
Gateway 2000 P5-60,  RZ-1000  Angus Black
Intel Mercury Rev 3           (angus@spanner.hiway.co.uk)
                              
                              Gary Farr
                              (garyfarr@ix.netcom.com)
                              
                              Daron Davis
                              (daron_davis@dca.com)
                              
                              Jerry Lynch (lynch.94@osu.edu)
                              
                              Keith Patterson
                              (dinosaur@buffnet.net)
                              
                              Rick Gregory
                              (rfg@us.dynix.com)
                              
                              Roy L. Smith
                              (smittyry@ix.netcom.com)
                              
Gateway 2000 P5-66   RZ-1000  Randy Nerwick
                              (nerwick@netcom.com)
                              
Gateway 2000 P5-90   RZ-1000  Alan Murphy (alan@jac.co.uk)
                              
                              Roy L. Smith
                              (smittyry@ix.netcom.com)
                              
Gigabyte GA586-AP,   CMD-640  Yacov Jegher
ALI chipset                   (jegher@accent.net)
                              
HP Vectra 590        CMD-640  Javier Vizcaino
                              (jvizcain@msn.com)
                              
Intel Hendrix        CMD-640  Clif Purkiser Intel Corp
                              (support@cs.intel.com)
                              
Intel Insight P5-60  RZ-1000  Jim Arnone
Premiere PCI II               (arnone@primenet.com)
Baby AT, Neptune              
Chipset

Intel Plato 90       RZ-1000  Adrian Teo
                              (adriant@singnet.com.sg)
                              
                              Alain Rassel
                              (Alain.Rassel@restena.lu)
                              
                              Chris Norman
                              (cnorman@oboe.aix.calpoly.edu)
                              
                              Clif Purkiser Intel Corp
                              (support@cs.intel.com)
                              
                              Kevin Chua
                              (chua@server.uwindsor.ca)
                              
                              Kevin T. Van Maren
                              (vanmaren@cs.utah.edu)
                              
                              Kim Hvarre
                              (kims@crash.ping.dk)
                              
                              Martin Kogelbauer
                              (e8826847@student.tuwien.ac.at
                              )
                              
                              Rick Nelson
                              (rnelson2@ccmail.unl.edu)
                              
                              Richard Techmanski
                              (richt@netcom.com)
                              
Intel Premiere       RZ-1000  Clif Purkiser Intel Corp
                              (support@cs.intel.com)
                              
Intel Premiere LPX   CMD-640  Clif Purkiser Intel Corp
                              (support@cs.intel.com)
                              
Intel Premiere MM    CMD-640  Clif Purkiser Intel Corp
                              (support@cs.intel.com)
                              
Intel Robin LC       CMD-640  Clif Purkiser Intel Corp
                              (support@cs.intel.com)
                              
Knowledgebase P90    CMD-640  Andy Longton
laptop                        (alongton@clark.net)
                              
Micron P5-90         CMD-640  Primary fails, secondary is
                              OK.
                              
                              Eric Johnson
                              (johnson@scripps.edu)
                              
                              Jim Short
                              (jdshort@primenet.com)
                              
                              Mike Coplien
                              (kcoplien@facstaff.wisc.edu)
                              
Micronics M54Pi      CMD-640  Adam Haar
                              (s9406709@yallara.cs.rmit.edu.
                              au)
                              
Midwest Micro P90    CMD-640  (412d25$e8j@clarknet.clark.net
                              )
                              
NEC Image P90        CMD-640  Mike Coplien
                              (kcoplien@facstaff.wisc.edu)
                              
Packard Bell Legend  CMD-640  James Treworgy
100CD                         (jamie@access.digex.net)
                              
PCI-EIDE local       CMD-640  (whelk@ios.com)
clone, Phoenix BIOS           
4.04, ALI chipset

Quantex P5/90 PM-2   RZ-1000  Jay Schamus
                              (jaylord@rcinet.com)
                              
S1366 PCI EIDE       CMD-     Ross Fleming
paddleboard          640B     (rossflem@serv.net)
                              
Scandic UMC          CMD-     Daniel Spangberg
VIO8810A             640B     (daniels@kemi.uu.se)
                              
Soyo SY-4SA2 486     ?        Jeffrey Hurwit
prior to B5                   (jhurwit@netcom.com)
                              
Tagram SQ-588        CMD-640  Kurt Krasinski
                              (kurt.krasinski@aquila.com)
                              
Unknown 486 DX       SMC37650 Eric Stephen Mountain
                              (esm1@oak70.doc.ic.ac.uk)
                              
Unknown 90 MHz       ?        Andreas
                              (abenamou@galaxy.csc.calpoly.e
                              du)
                              
                              Carol Lim (law30185@nus.sg)
                              
Viglen P90 (Intel    RZ-1000  Phil Buckley
Plato)                        (phil@starbug.swstyle.co.uk)
                              
Vobis                RZ-1000  Thomas Wagner
                              (twagner@bix.com)
                              
Vobis 486DX2-66      CMD-640  Guy Kapteijns
                              (W.Kapteijns@kub.nl)
                              
Zenon P90            RZ-1000  Aria Novianto
                              (novianap@cs.purdue.edu)
                              
ZEOS Pantera         RZ-1000  Paul Whitelock
                              (paulw9DDFL3r.DDI@netcom.com)
                              



Known Good Motherboards

The following motherboards have been tested with EIDEtest or CDtest
and found to be ok. Not to worry, there are many more good boards than
I have listed here:

Motherboard      Chip    Reporters
                         
Arsys P200-PCI   Triton  Robert Aboud
                 /sis    (raboud@pacific.telebyte.c
                         om)
                         
ASUSTeK PCI/I-   Triton  Roedy Green
P54TP4                   (Roedy@bix.com)
                         
Dell Dimension   ?       Note: older versions of
XPS P90c                 this board were flawed.
                         
                         Dave Nuttall
                         (dnuttall@texas.net)
                         
Intel Zappa      Triton  Ron McGlade
                         (ronmc@primenet.com)
                         
Micronics 486    ?       Bob Meredith
VLB                      (meredith@interactive.net)
                         
Seanix           Opti    Bill Unruh
                 Viper   (unruh@physics.ubc.ca)
                         
Soyo SY-4SA2     SYS     Jeffrey Hurwit
486/B5                   (jhurwit@netcom.com)
                         



What Can You Do If You Have A Flaw?

1)   Pester the manufacturer. Unfortunately, the EIDE controller chips
are soldered in. The only way to repair a flaw is to replace the whole
motherboard, recycling the socketed chips -- the CPU, DRAM and SRAM
cache. It would be very expensive for computer and motherboard
manufacturers to fix a flaw.

After a month of stonewalling, Dell has announced it will offer a BIOS
upgrade to turn off the prefetch buffers.

According to lovergin@ens.lifl.fr, one retailer, La Cle Informatique,
in France is offering to replace the defective Vobis motherboards it
sold.

You can contact Dell at support@us.dell.com or (800) 624-9896.

Intel is now acknowledging the problem. For a short while, Intel
offered to replace defective motherboards, then they reneged. You can
contact them at support@cs.intel.com or call their tech support line
(800) 628-8686. Select options 1-3-1. You can find international
contact numbers at: http://www.intel.com/intel/intelis/contact.html.

You can call ASUSTeK at (408) 956-9077.
Call PC-Tech at (612) 345-4555.
Call CMD Technology at (714) 454-0800 or (800) 426-3832, or fax
(714) 455-1656.

2)   Buy a new unpopulated Triton PCI motherboard and recycle the CPU,
DRAM and SRAM cache chips from the old motherboard. Unfortunately, the
Triton chipset has design shortcuts that hamper performance in
simultaneous I/O situations. At least they don't corrupt data.

3)   Run the controller in degraded mode. Some BIOSes have a feature
to disable the EIDE prefetch buffer. Vendors may offer a BIOS upgrade to
allow you to manually disable prefetch. The BIOS may also turn it off
automatically if either of the defective chips is present. This will
bypass both RZ-1000 flaws and two of the five CMD-640 flaws. Art Scott
(scotta@pilot.msu.edu) suggests that you can sometimes tweak the
performance of the RZ-1000 back up by changing, from 66 to 33, the
advanced BIOS setting for the maximum number of cycles that a PCI
device can hold onto the PCI bus before the next board gets a turn.

4)   Buy a PCI EIDE paddleboard controller such as the DTC 2130S, the
Tekram 290N/290S, the Promise 2300+ or the BusLogic BT-910 to replace
the one on the motherboard. You must disable the EIDE controller on
the motherboard. This fix will waste one of your precious slots. Be
careful. You could be leaping out of the RZ-1000 frying pan into the
CMD-640 fire since paddleboards often use the CMD-640.

5)   Buy a SCSI hard disk and CD-ROM, and avoid using the EIDE ports
entirely. Under OS/2 and Linux, SCSI gives better performance, but
costs more. DOS, Windows, Windows For WorkGroups and Windows-95 are
unable to exploit the advanced features of SCSI, but at least avoid
the EIDE flaws when you go pure SCSI.

6)   Find a software work-around. There are fixes for Warp to bypass
all the flaws in the RZ-1000 and CMD-640. Fixpack 10 is the first
fixpack to bypass the flaws. Now that Intel and IBM have finally
revealed the technical details, all the operating system writers can
patch their EIDE drivers to bypass the flaws. There are also fixes for
NT 3.1 and 3.5. See below for details.

7)   Get a BIOS upgrade. For DOS, DESQview, and Windows 3.1, to bypass
the flaws you may need a new BIOS -- an EPROM chip. If you have a
flash BIOS, you can update it simply by downloading a file. Most
BIOSes already have code to bypass the flaws for DOS, DESQview and
Windows. However, more advanced operating systems bypass the BIOS, so
even a smart BIOS will not protect you. Even so, the BIOS CMOS
settings may allow you to disable prefetch, which protects you even
in true multitasking operating systems.

8)   Cut the trace. Cut the trace on the motherboard from the floppy
changeline to the EIDE controller. However, this just bypasses one of
the CMD-640's five flaws and one of the RZ-1000's two flaws.

9)   Use the Secondary EIDE Controller. Some motherboards such as the
Micron P5-90 M54Pi-N 11P use different kinds of controller on the
primary and secondary EIDE ports. The primary may be flawed, but the
secondary OK.

Whatever method you use to bypass the flaws, retest with EIDEtest and
CDTest afterwards to be sure your fix worked and you caught all the
problems.


Cleaning Up The Mess

Once you have bypassed the flaws, you can start working on the problem
of cleaning up your files.

The first thing to do is to re-install your operating system and all
your application programs. This will replace any damaged EXE and DLL
files.

Catching errors in your data files is more difficult. Keep your eyes
peeled for any improbable spreadsheet results. You may have to hire a
programmer to write you some comb programs to sniff through your
databases, looking for suspicious values.
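
What such a comb program looks like depends entirely on your data. As
one hedged illustration, here is a Python sketch for comma-delimited
files. It flags any cell that contains non-printable characters, and
any cell that fails to parse as a number in a column where nearly
every other cell does. The 90% threshold and the function names are
arbitrary choices of mine, and a header row in a numeric column will
be flagged too.

```python
import csv


def is_number(text):
    """Return True if the text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def find_suspicious_cells(rows, numeric_threshold=0.9):
    """Return a list of (row_index, column_index, value) for cells that look damaged.

    A column counts as numeric when at least numeric_threshold of its
    non-empty cells parse as numbers. In such a column, every non-empty
    cell that does not parse is reported. Cells containing non-printable
    characters are reported in any column.
    """
    rows = list(rows)
    width = max((len(r) for r in rows), default=0)
    suspicious = []
    for col in range(width):
        cells = [(i, r[col]) for i, r in enumerate(rows)
                 if col < len(r) and r[col] != ""]
        if not cells:
            continue
        numeric = sum(1 for _, v in cells if is_number(v.strip()))
        column_is_numeric = numeric / len(cells) >= numeric_threshold
        for i, v in cells:
            broken_number = column_is_numeric and not is_number(v.strip())
            if not v.isprintable() or broken_number:
                suspicious.append((i, col, v))
    return suspicious


def scan_csv(path):
    """Run find_suspicious_cells over a comma-delimited file on disk."""
    with open(path, newline="") as f:
        return find_suspicious_cells(csv.reader(f))
```

A scan like this only catches damage that breaks the pattern of the
data. A corrupted digit that still leaves a valid number will pass
unnoticed.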

If you routinely use the verify feature of Lotus Magellan, it can
detect changes to files that should not have changed. This may help
you uncover some of the damage. The flaws are not polite enough to
redate the files they corrupt.
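
If you do not have Magellan, a verify pass of this kind can be
approximated with a short script. The following is only a minimal
sketch in Python, not Magellan's actual method; the function names
snapshot and silently_changed are my own illustrative choices. It
records a checksum and timestamp for every file, and later reports
files whose contents changed even though their date did not.

```python
import hashlib
import os


def snapshot(root):
    """Map each file under root to (SHA-256 digest, modification time)."""
    result = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            result[path] = (digest, os.stat(path).st_mtime_ns)
    return result


def silently_changed(old, new):
    """Files whose contents differ although the timestamp is unchanged."""
    return sorted(
        path
        for path, (digest, mtime) in new.items()
        if path in old and old[path][1] == mtime and old[path][0] != digest
    )
```

Take one snapshot after you have bypassed the flaws and compare later
snapshots against it. A file that appears in the silently_changed
list deserves a close look.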

If you have backups from before the time you bought the faulty
machine, you can restore them and re-key everything.

Most people will not be so fortunate. All their backups will also be
corrupt.

Most people with flaws will just have to put up with random errors
dotting their data files ever after.


Operating System Summary

Operating System   Work Around
                   
Netware            - No problems reported.
Unixware 1.1       
NEXTSTEP
Banyan
Solaris 2.4+
SCO Unix 3.1+
Windows-95

DOS                - No problems reported so far. If you do
DESQview           have trouble:
Windows 3.1        - Turn off EIDE prefetch in CMOS
                   settings.
                   - Upgrade BIOS chip.
                   - Turn off simultaneous disk/floppy/tape
                   I/O in your backup programs.
                   
Windows For        - Turn off 32-bit disk access mode.
WorkGroups         - Turn off EIDE prefetch in CMOS
                   settings.
                   - Upgrade BIOS chip.
                   - Turn off simultaneous disk/floppy/tape
                   I/O in your backup programs.
                   
Windows NT 3.1     - Turn off EIDE prefetch in CMOS
                   settings.
                   - Apply ATDISK.SYS fix.
                   
Windows NT 3.5     - Turn off EIDE prefetch in CMOS
                   settings.
                   - Apply the 640XNT35.ZIP fix.
                   
OS/2 2.1           - Disable prefetch buffer in CMOS
                   settings.
                   - Load the IBMINT13.I13 driver instead
                   of the IBM1S506.ADD driver. This trick
                   will only work if your BIOS has flaw
                   bypass code. It will be slow.
                   - Upgrade to Warp
                   
OS/2 Warp 3        - Apply Fixpack 10, it contains all the
                   special fixes.
                   
                   If for some reason, you are unwilling to
                   apply Fixpack 10, you can do the
                   following:
                   - Disable prefetch buffer in CMOS
                   settings.
                   - Apply the RZ-1000 portion of
                   pj19409.zip if you have the RZ-1000.
                   - Apply the CMD portion of pj19409.zip
                   including IBMIDECD.FLT if you have the
                   CMD-640.
                   - If that does not work, try
                   basedev=CMD640x.add /16BIT.
                   - In a pinch, if you cannot do either of
                   the first two things, add a line to
                   config.sys BASEDEV=IBMINT13.I13 and
                   remove the line BASEDEV=IBM1S506.ADD. The
                   IBMINT13.I13 device driver lives in
                   C:\OS2\BOOT, on the first install
                   diskette, and on the CDROM in
                   \OS2IMAGE\DISK_1. This trick will work
                   only if your BIOS has flaw-bypass code.
                   It will be slow.
                   
Linux              - Disable prefetch buffer in CMOS
                   settings.
                   - To bypass the CMD-640 flaws use the
                   boot time kernel parameter:
                   hda=serialize.
                   - To bypass the prefetch flaws, keep the
                   default setting of the external hdparm
                   (Hard Disk Parameter) utility, which
                   suppresses interrupts during I/O.
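
The last-resort config.sys change listed above for OS/2 can be
sketched as a small text filter. This is a hedged illustration in
Python, not an IBM tool; the function name use_int13_driver is
hypothetical, and it assumes the driver line has the usual
BASEDEV=IBM1S506.ADD form. Back up config.sys before trying anything
like it.

```python
def use_int13_driver(config_text):
    """Return config.sys text with IBM1S506.ADD swapped for IBMINT13.I13."""
    out = []
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        is_basedev = sep and key.strip().upper() == "BASEDEV"
        if is_basedev and value.strip().upper().startswith("IBM1S506.ADD"):
            out.append("BASEDEV=IBMINT13.I13")   # slow BIOS-based driver
        else:
            out.append(line)                     # leave every other line alone
    # Lines are rejoined with "\n"; convert to CR/LF if your editor needs it.
    return "\n".join(out) + "\n"
```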
                   

Reporting Your Findings

Whether or not you find any flaws, please email me at Roedy@bix.com or
post the following information in the Internet newsgroup
comp.os.os2.bugs:

1)   Test results. (I would like to hear about machines both with and
without flaws.)

2)   Brand and model of your motherboard.

3)   Brand and model of your entire system.

4)   Which chip did you find, the RZ-1000, the CMD-640, the SMC 37650?
What did SYSINFO 3.02 report about your EIDE controller chip?

5)   Have you noticed data file corruption?

6)   Which tests and versions did you use? (IOtest, EIDEtest, CDtest,
RZtest, CtrlTest or visual inspection)

7)   What activities did you run in the background during the test?

8)   Which operating system and version did you use to run the test
(e.g. Warp Connect blue spine)?

9)   Which fixpacks and patches did you apply before running the
test?

10)  Brand and model of EIDE hard disk

11)  Brand and model of EIDE CD-ROM

12)  Markings on the suspect chip, e.g., "RZ-1000BP", "CMD PCIO640B",
"SMC 37650".

13)  Vendor's name

14)  Vendor's response on informing him of your problem.

Whose Fault Is It?

The wags will have fun tormenting Intel for using the flawed RZ-1000
and CMD-640 in its motherboard designs, even though Intel did not
manufacture either of the two faulty chips. Intel is not the only
company to manufacture motherboards with the faulty chips, but Intel
will bear the brunt of the bad publicity.

PC-Tech manufactured the faulty RZ-1000 EIDE controller chip used in
many PCI motherboards. PC-Tech is a subsidiary of ZEOS, the
clonemaker. In turn Micron Electronics owns ZEOS. PC-Tech has offices
just down the street from ZEOS in Minnesota. Intel bought the chips
from PC-Tech, and in turn many clone makers bought motherboards from
Intel. Other motherboard manufacturers also used the faulty chips. In
a similar way Intel and other companies also used the CMD-640 chip
from the CMD Technology Corporation of Irvine, California.

PC-Tech, Intel and the clone makers all failed to test their designs
properly. The software makers did not test their software on enough
machines to show up the problem before releasing it.

Even worse, in some motherboard designs, Intel used the CMD-640 chip.
This goof was inexcusable, since the chip, by deliberate design, is
incapable of simultaneous I/O.

How did the flawed CMD-640 chip and the RZ-1000 slip through Quality
Assurance testing? My guess is no one did real world testing;
technicians only tested under laboratory conditions using only simple
operating systems like DOS. They might have ignored flaws that
happened only sporadically, blaming them on a faulty chip rather than a
faulty design. It is very hard to catch a flaw that only manifests
rarely.

CMD, PC-Tech, Intel, and Microsoft have known how to bypass
these problems for quite some time. IBM was aware there was a problem
but was unaware of the solution. For obvious reasons, these companies
were reluctant to inform the public of the danger of the ongoing
subtle corruption.

No one who understood the RZ-1000 and CMD-640 flaws publicised their
findings. If PC-Tech, Intel and Microsoft had not been so secretive,
they could have averted the damage. Perhaps they were silent because
the flaws primarily hurt the customers of a competitor, IBM.

The collective damage done by withholding information about the flaws
is huge, certainly many millions of dollars for those large companies
whose backups are corrupt as well. It will be interesting to see if
anyone launches a damage lawsuit against CMD, PC-Tech, Intel or
Microsoft. If they do, it might make both hardware and software makers
more careful about releasing improperly tested products.

IBM is not totally innocent either. According to Massimiliano Vispi
(massiv@mix.it), on June 17, 1994, IBM posted a document:
http://ps.boulder.ibm.com/pbin-usa-
ps/pub_huic_getrec.pl?DVantero.swm.boulder.ibm.com+DBos2+DA22398+ST"H0
85835"+USPublic
that stated:

     "Another case has been where the PCTech chip RZ-1000 used
     for IDE operations on the PCI bus is in use (PJ15378). On
     Intel Pentium motherboards with PCI/IDE on board slot, data
     is sometimes lost. This is a hardware error. This is
     PJ15378."
     
Sam Detweiler of IBM explained that this referred only to the
RZ-1000's trailing two-byte loss problem. IBM was not aware of the
concurrent floppy problem with prefetch at that time.

Discussions with Intel and PC-Tech led IBM to believe that re-writing
the interrupt handler to avoid reading the IDE status register
recursively would solve the problem. PC-Tech never did explain the
precise failure mechanism.

IBM says the CMD-640 problem also appeared in October 1994 with the
Vobis systems. CMD did not inform IBM of the problem.

Prefetch also affected the CMD chips (640, 640A and 640B). CMD built
their own driver based on IBM code to handle the serialisation
problem. They did not fix the prefetch problem in their driver so it
appears they too were unaware of it at this time.

There is potential here for some massive lawsuits. No wonder the
companies who knew about the flaws have been so tight-lipped. Think of
the damage if Boeing or GM had its plans for coming products stored on
flawed machines. Literally, these flaws could cause plane crashes.


Intel's Spin

There are three levels of "Intel Inside".

1)   Weak. Your motherboard has an Intel CPU but a support chipset
from another manufacturer.

2)   Medium. Your motherboard has an Intel CPU and Intel support
chipset such as the Neptune or Triton, but some other company built
the BIOS and motherboard.

3)   Strong. Your motherboard has an Intel CPU, Intel support chipset,
Intel motherboard and Intel BIOS.

Intel literature on the RZ-1000 and CMD-640 only refers to (3). Intel
cannot very well speak for (1) and (2) where the PCI EIDE controller
design was out of their control, even though these machines bear the
"Intel Inside" logo.

Intel does not make this distinction clear in their literature.

According to Intel, "This problem is a consequence of the RZ-1000's
inability to fully compensate for all the implications of running an
IDE hard disk as an extension of the PCI bus, instead of running as an
extension of the AT bus which it was originally designed to do."

Intel would have us believe the problems are not flaws per se, but
rather a limitation that the programmers forgot to take into
consideration.

The truth is grey. UART chips have similar flaws. Programmers have
gradually learned to code around them. We don't insist that all COM
port hardware be recalled. We now tend to blame a programmer if he
does not bypass the known UART flaws.

Given that software work-arounds are now possible, the primary blame
for any perpetuation of the problem shifts to the software authors.

However, there are many other EIDE chip designs that do not have this
"limitation". Since the chips are supposedly generic implementations of
the ATA interface standard, I cannot so lightly excuse these flaws.


Speculation

Because setting the flaws right would be so expensive, I suspect that
clone makers and motherboard manufacturers will continue to refuse to
replace the defective equipment. At best they may offer BIOS upgrades
to bypass the flaws. Microsoft has already added code to Windows-95 to
bypass the flaws. Clone makers will rely on software vendors to write
drivers that bypass the flaws for Warp, NT, Linux and the various
UNIXes.

Now that the OS/2 fixes are out, the pressure to set things right will
dwindle. Since DOS, Windows in 16-bit mode, and Windows-95 are immune,
little pressure to correct the problem is likely to come from those
camps.

The motherboard manufacturer has five options:

1)   Replace the motherboard. Recalls on a mass scale would be
extremely costly for the motherboard manufacturers, so you can count
on them to fight. ($400 parts + $250 labour)

2)   Provide a replacement paddleboard EIDE controller that takes up a
PCI slot. ($75)

3)   Provide a new BIOS chip that bypasses potential problems for DOS
and Windows. The BIOS could also turn off prefetch which would rescue
multitasking operating systems that do not use the BIOS for I/O. ($10)

4)   Tell the users to upgrade to software that bypasses the flaws,
and to turn off simultaneous disk/tape/floppy I/O in any backup
software run under DOS, DESQview or Windows. Users won't like the
performance hit, however. ($0)

5)   Stonewall and refuse to even acknowledge the problem. This will
be more difficult now that Intel and Dell have publicly admitted the
problem. ($0)

Intel has already set the precedent by offering to replace defective
Pentiums, even though software can bypass its divide flaw. The RZ-1000
flaws are far more serious, and the CMD-640 flaws are more serious
still.

Keeping this under wraps is going to be hard for the clone builders.
Brooke Crothers of Infoworld did several stories based on my
compilations. I have been in contact with Jerry Pournelle of Byte. I
sent email to John Dvorak. Even Dean Takahashi of the San Jose Mercury
Daily News did a story. In the November 1995 editions, a 1000-word
abridged version of this essay appeared across Canada in The Computer
Paper and Toronto Computes. The stonewall is coming tumbling down. As
one individual pointed out, "I read your postings on the Internet, and
see them the next day quoted in my daily newspaper."


What Are the Flaws?

IBM confirmed the RZ-1000 has two different flaws:

1)   In prefetch mode, multi-sector reads often fail.

2)   The chip erroneously responds to floppy status commands and
corrupts hard disk or CD-ROM I/O in the process.

IBM confirmed the CMD-640 has five different flaws:

1)   It has the same prefetch problem as the RZ-1000.

2)   It has the same floppy status problem as the RZ-1000.

3)   It does not support simultaneous I/O on the primary and secondary
EIDE ports.

4)   Confusion over legacy and PCI mode.

5)   Does not support 32-bit writes.

The Flaws Under A Microscope

After the manner of Ionesco, Roedy Green said, "All great programmers
are paranoid." Programmers have to anticipate problems that could
happen only once in a trillion machine cycles since such problems
would still show up on average every three hours. EIDE problems
sometimes go days without manifesting. Sometimes they show up within
seconds, depending on the unrelated I/O activity in the machine.
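
The three-hour figure is easy to check. The short Python calculation
below is a sketch that assumes one machine cycle per clock tick on a
90 MHz Pentium; the clock rate is my assumption for illustration, not
a measured value.

```python
CLOCK_HZ = 90_000_000          # assumed clock rate: a 90 MHz Pentium
CYCLES_PER_FAILURE = 10**12    # "once in a trillion machine cycles"

seconds = CYCLES_PER_FAILURE / CLOCK_HZ
hours = seconds / 3600
print(f"one failure every {hours:.1f} hours on average")
```

On a slower clock the interval stretches proportionally, but it stays
short enough that a careful programmer cannot ignore it.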

I have read about ten conflicting explanations from authorities on the
cause of the problems. Much of the confusion comes because there are
so many different flaws -- all generating similar symptoms. I based
the following explanations on postings from Sam Detweiler of IBM's
Warp Device Driver section (sdetweil@vnet.ibm.com).

The RZ-1000 and CMD-640 both have the prefetch flaw and the floppy
status flaw. The CMD-640 has three additional flaws. I will focus on
the three most important.


Flaw 1: Prefetch Buffer Flaw

The RZ-1000 and CMD-640 both have the prefetch flaw.

Data moves from the hard disk to RAM via a bit bucket brigade. The
RZ-1000 grabs data 16 bits at a time from a buffer in the integrated
controller in the hard disk, and hands it off 32 bits at a time to
the PCI bus. The CPU sits in a tight loop grabbing data from the PCI
bus and storing it in RAM. In prefetch mode, the RZ-1000 keeps ahead
of the CPU, requesting two 16-bit chunks from the hard disk, in order
to have a 32-bit chunk ready when the CPU asks.

When you disable the prefetch buffer, you turn off the parallelism and
run in a degraded lock-step mode. In degraded mode, the RZ-1000 waits
until the CPU asks for a 32-bit chunk. Then it puts the CPU on hold
while it asks the hard disk for two 16-bit chunks. It glues them
together, and puts them on the PCI bus and allows the CPU to continue.
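
The gluing step in lock-step mode can be pictured with a few lines of
Python. This is only a sketch of the idea, not the chip's logic. The
names glue and read_sector_lockstep are illustrative, and placing the
first 16-bit chunk in the low half of the 32-bit value is an
assumption based on the usual PC byte order.

```python
def glue(first_word, second_word):
    """Combine two 16-bit words into one 32-bit value, first word low."""
    return (first_word & 0xFFFF) | ((second_word & 0xFFFF) << 16)


def read_sector_lockstep(disk_words):
    """Yield 32-bit chunks, fetching two 16-bit words only on demand."""
    words = iter(disk_words)
    # zip pulls two consecutive words from the same iterator per request,
    # which mimics the CPU being held while both halves are fetched.
    for first, second in zip(words, words):
        yield glue(first, second)
```

Because the generator does no work until the caller asks for the next
chunk, nothing is ever fetched ahead, which is exactly what costs the
parallelism.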

I advise all but the most dedicated technophiles to skip the next
paragraph.

If the RZ-1000 is running with prefetch enabled, it erroneously
considers a sector read complete as soon as it has grabbed the last 16
bits from the hard disk and stuffed them into the prefetch FIFO buffer.
It should not consider it complete until the CPU has stuffed all the
data into RAM. The RZ-1000 then starts to read the next sector. If the
current read operation is interrupted, or delayed by simultaneous DMA
from some unrelated device, before the last two bytes are read from
the FIFO, and the next sector is prefetched into the FIFO before the
current data transfer completes, then the chip will erroneously signal
yet another Data Available Interrupt. Because OS/2 has already
signalled EOI (End Of Interrupt) to the PIC (Programmable Interrupt
Controller) and enabled interrupts, it recurses into the disk driver
interrupt handler. The driver then reads the status register.
Unfortunately, because of a cheap design shortcut, the FIFO is used
both for data and status. The CPU reads the data in front of the
status as if it were the status. This causes the interrupted data
transfer to later read the following status as if it were data,
resulting in corruption. Both the RZ-1000 and CMD-640 fail in exactly
the same way.
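
For those who did read it, here is a toy model of the failure in
Python. It is a deliberately simplified sketch, not a description of
the real silicon: the FIFO is a queue holding the data words followed
by one status marker, and the function name transfer and its
parameter are hypothetical.

```python
from collections import deque

STATUS = "STATUS"   # stand-in for the status queued behind the data


def transfer(sector, spurious_interrupt_at=None):
    """Read one sector through a FIFO shared by data and status.

    If spurious_interrupt_at is a word index, the interrupt handler runs
    just before that word is read and reads "status" from the FIFO.
    Returns (words the driver stored in RAM, what the handler saw).
    """
    fifo = deque(list(sector) + [STATUS])
    ram, handler_saw = [], None
    for i in range(len(sector)):
        if i == spurious_interrupt_at:
            handler_saw = fifo.popleft()   # data mistaken for status
        ram.append(fifo.popleft())         # may now be status, not data
    return ram, handler_saw
```

Calling transfer([10, 20, 30, 40], 3) shows the handler receiving the
last data word as "status", while the interrupted transfer stores the
status marker in RAM in its place.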

There are two software techniques to bypass this flaw:

1)   Never schedule more than one I/O at a time. Use strict polled
mode with no interrupts. Turn off all unrelated interrupts during I/O.
This is the DOS/Windows approach. The disadvantage is poor performance
and possible lost incoming modem characters.

2)   Turn off the prefetch buffer. According to Intel and IBM, in a
lightly loaded system, there is sufficient spare capacity on the PCI
bus so running in degraded mode only slows the disk down by 1%.
However, programs making extensive use of the PCI bus such as LANs or
video bit-map painting will also slow down. Both Intel and IBM tell us
that turning off prefetch to bypass the flaw has negligible effect on
performance. Yet in the Plato BIOS rev 12, Intel says that enabling
the prefetch buffers will "significantly increase PCI IDE Hard Disk
performance." They can't have it both ways.


Flaw 2: Floppy Status

The RZ-1000 and CMD-640 both have the floppy status flaw.

This flaw is the result of an incredible chain of blunders.

The original MFM (the predecessor to IDE) interface design blunder was
using different bits of the same I/O port, 3F7, for two unrelated
purposes, detecting the floppy changeline and reporting hard disk
status. Modern EIDE controllers are no longer supposed to do this, but
some chips carry on in the old tradition and provide legacy logic.
Motherboard manufacturers then often blunder by attaching the floppy
changeline to the EIDE controller. This way both the EIDE controller
and the floppy controller think they are in charge of reporting floppy
changeline status. On top of that, the designers of the RZ-1000 and
CMD-640 chips both blundered by trying to save a little silicon by
using the same registers to store both hard disk status and data.

For the insatiably curious, here is precisely how the corruption
occurs. Simultaneous I/Os to both the hard disk and floppy disk are
running. The floppy controller generates an I/O complete interrupt.
The floppy driver then checks the floppy status. Part of reading floppy
status is checking the changeline bit -- contained in the ambiguous
port 3F7.

If the motherboard manufacturer goofed and hooked up the floppy
changeline to the EIDE controller, the RZ-1000 erroneously responds to
the floppy status request. It is in charge of the hard disk, not the
floppy. It is the floppy controller's job to respond. The RZ-1000
feeds two data bytes from its FIFO out as floppy status. These data
were supposed to go to the hard disk driver. Thus the chip loses
two bytes from the hard disk transfer, corrupting data. Turning off
prefetch also solves this problem. Unlike the first flaw, only
simultaneous floppy I/O can trigger this problem. Simultaneous
I/O of any kind can trigger the first flaw.
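
The two-byte loss can be shown with another toy model in Python.
Again this is a simplified sketch under my own assumptions, not the
chip's real behaviour; the name hard_disk_read and its parameter are
hypothetical.

```python
from collections import deque


def hard_disk_read(sector, floppy_check_before_byte=None):
    """Return the bytes the hard disk driver receives from the FIFO.

    If floppy_check_before_byte is an index, a read of port 3F7 happens
    just before that byte, and the chip answers it with up to two bytes
    taken from the FIFO.
    """
    fifo = deque(sector)
    received = []
    for index in range(len(sector)):
        if index == floppy_check_before_byte:
            for _ in range(min(2, len(fifo))):
                fifo.popleft()             # lost: sent out as "floppy status"
        if fifo:
            received.append(fifo.popleft())
    return received
```

With six bytes and a floppy status check before the third,
hard_disk_read([1, 2, 3, 4, 5, 6], 2) delivers only 1, 2, 5 and 6 to
the hard disk driver.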


Flaw 3: No Simultaneous I/O

Only the CMD-640 has this flaw. The CMD-640 can't do more than one I/O
at a time. This flaw was so obvious everyone found out about it long
ago. No EIDE controller (even a fully functioning one) can run
master and slave simultaneously. However, two separate EIDE
controllers are supposed to allow primary and secondary channels to
run at once. The CMD-640 has dual controllers on one chip. However,
because it lacks two register sets, the primary and secondary
channels will not work simultaneously, unlike every other design. For
example, you can't run your EIDE hard disk and EIDE CD-ROM at the same
time.

Simultaneous I/O speed is the reason we put two EIDE devices on
separate channels, both as masters, rather than making one a master
and one a slave on the same channel.

IBM has a bypass for this blunder. When it detects a CMD-640, Warp
never schedules more than one I/O at a time on that chip, reducing
the operating system to DOS-like performance.
Independent experiments show the degradation from using the CMD fix is
15 to 50%.
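
The idea behind such a bypass is plain serialisation. The Python
sketch below shows the general technique with one lock shared by both
channels; it is not IBM's driver code, and the class name
SerializedChannels is hypothetical.

```python
import threading


class SerializedChannels:
    """Allow only one I/O at a time across both EIDE channels."""

    def __init__(self):
        self._lock = threading.Lock()   # one lock shared by both channels
        self._active = 0
        self.max_active = 0             # highest number of overlapping I/Os

    def do_io(self, operation):
        """Run operation() while holding the shared lock."""
        with self._lock:
            self._active += 1
            self.max_active = max(self.max_active, self._active)
            try:
                return operation()
            finally:
                self._active -= 1
```

Every request on the primary or secondary channel waits its turn, so
max_active never exceeds 1. That safety is also where the performance
goes.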


Background

If you read the literature on this problem, you will see various
daunting technical terms. Here is a rough explanation.

There are six kinds of I/O used in PCs.

1)   PIO - Programmed I/O. The CPU spoon-feeds each byte to the I/O
port. The port can usually accept data as fast as the CPU can feed it.
Typical IDE drives work this way under DOS. For slower devices, the
CPU polls the status to see if the device is ready for yet another
byte.

2)   Scheduled I/O. This is a variant of PIO where the operating
system feeds the I/O device some bytes, then calculates how long it
should take for the I/O device to digest them, then it goes away for a
while to do something else, then it comes back when it figures the I/O
should be complete, and feeds the device a few more bytes. This is how
Warp usually controls parallel port printers.

3)   Interrupt I/O. Every time the port is ready to eat another byte,
it raises an interrupt and the CPU feeds it some more. This is the
typical way COM ports work and how Warp uses printers with the /IRQ
option. Warp EIDE drivers combine methods (1) and (3). The hard disk
interrupts when it has completed the read into its on-board buffer.
Then the CPU fetches data out of the buffer with PIO mode.

4)   Third party DMA. The DMA controller on the motherboard copies
data from RAM to the port and generates an interrupt when it is done
with a block. Floppy drives and inexpensive mag tape backup drives use
this method. Because of the unfortunate original AT design
compromises, this method is exceedingly slow. Third Party DMA is never
used for PCI bus devices though it is still used for ISA or
motherboard-based floppy controllers on PCI motherboards.

5)   First party DMA, sometimes called Bus Mastering. A DMA controller
on the device copies data from RAM to the port and generates an
interrupt when done. High-end SCSI cards, such as the Adaptec 2940 or
2940W, use this ultimate way to fly.

6)   Memory mapped I/O. The CPU copies data to a magic region of RAM
which is actually on the I/O device. LAN cards or REGEN VRAM on video
cards use this technique.

In a true multi-tasking system, such as OS/2, the CPU goes off and
works on behalf of applications when the port is busy, and trusts an
interrupt to bring it back when the device needs more service. It
schedules several I/Os simultaneously. In contrast, DOS and Windows
never do more than one I/O at a time. Further, under DOS/Windows the
CPU idles while waiting for its single I/O to complete rather than
working on applications.


Learning More

You can use the Internet to learn more about this problem. If you do
not have Internet access, I can provide you these files on diskette.
See below for details. When accessing files on the Internet generally
you must use lower case.


Test Programs
     
Roedy Green's EIDEtest and CDtest programs for DOS, DESQview, Windows,
Windows For WorkGroups, Windows 95, NT, OS/2 and Warp. They ensure
your hard disk and CDROM will function without interference from
background I/O activity. These indirectly detect the flawed RZ-1000
and CMD-640 chips. By the time you read this, I may have posted a
newer version.

     ftp://garbo.uwasa.fi/pc/diskutil/eidete19.zip
alternatively
     ftp://ftp.cdrom.com/.4/os2/incoming/eidete19.zip
or
     ftp://ftp.cdrom.com/.4/os2/sysutil/eidete19.zip

Intel's RZ-1000 and CMD-640 chip detect program. RZtest.exe expands to
form CtrlTest.exe. Beware! Old versions of the CtrlTest.Doc
documentation contain an MSWord macro virus.

     http://www.intel.com/procs/support/rz1000/index.html
     
     or
     
     ftp://ftp.intel.com/pub/PCandNetworkSupport/Intel_News/$RZ1000.EX
     E
     
IOTest from PowerQuest, the makers of Partition Magic, a Warp test for
     the flaws.
     
     http://www.powerquest.com/download/iotest.zip
     
Version 3.02 of SYSINFO, the self-extracting Warp utility that should
be placed in OS2\APPS. SYSIGUI.EXE will emerge.

     ftp://ftp.software.ibm.com/ps/products/os2/fixes/v3.0warp/english-
     us/sitcsd/sysinfo.exe
     

Fixes
     
Warp Fixpack 10. This bypasses the flaws for both the RZ-1000 and
CMD-640 faulty EIDE chips. It also fixes numerous other bugs in Warp.
It comes as a set of six files totalling about 8 MB. Make sure you
get it from an official IBM CSD site because there are leaked, buggy
pre-release copies floating about the net. Before applying it,
verify that the readme.1st on the first fixpack disk is dated 9/21/95
at 17:40. The package as a whole should be dated 9/22/95 or later.
This fixpack applies to all versions of Warp including Warp Connect.
It contains in itself all earlier fixpacks. You don't need to apply
any previous fixpacks first. If you have the CMD-640, it is especially
important you carefully read the installation instructions. You need
to manually modify config.sys. DO A COMPLETE BACKUP FIRST. Many people
are having a variety of troubles with Fixpack 10 -- often traced to
failure to carefully follow the installation instructions, including a
COMMIT step.


ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-
us/xr_w010/xr_w010.1dk

ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-
us/xr_w010/xr_w010.2dk

ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-
us/xr_w010/xr_w010.3dk

ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-
us/xr_w010/xr_w010.4dk

ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-
us/xr_w010/xr_w010.5dk

ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-
us/xr_w010/xr_w010.6dk

alternatively

     ftp://ftp.pcco.ibm.com/pub/corrective_service/xr_w010.1dk
     ftp://ftp.pcco.ibm.com/pub/corrective_service/xr_w010.2dk
     ftp://ftp.pcco.ibm.com/pub/corrective_service/xr_w010.3dk
     ftp://ftp.pcco.ibm.com/pub/corrective_service/xr_w010.4dk
     ftp://ftp.pcco.ibm.com/pub/corrective_service/xr_w010.5dk
     ftp://ftp.pcco.ibm.com/pub/corrective_service/xr_w010.6dk

WFWIN10.ZIP. It updates the Warp install diskettes (for all released
versions) to the Fixpack 10 level, including the RZ-1000/CMD-640 fixes.
However, it does not automatically install the CMD-640 files.


ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/english-
us/wfwin10/wfwin10.zip

Microsoft Windows NT 3.1 ATDISK.SYS fix for the CMD-640 chip:
     
     http://www.microsoft.com/KB/softlib/mslfiles/pciatdsk.exe
     
Microsoft Windows NT 3.5 fix for the CMD-640 chip:
     
     CMD's BBS at (714) 454-1134. File 640XNT35.ZIP
     
If you don't want to install the entire Fixpack 10, you can install
these Warp bypasses for the RZ-1000 and the CMD flaws. Warning. This
file has been updated several times without changing the name. Make
sure you get the most recent. The installation instructions are
tricky. Follow them carefully.

     ftp://service.boulder.ibm.com/ps/products/os2/fixes/v3.0warp/engl
     ish-us/pj19409/pj19409.zip
     
CMD fixes for the CMD-640 chip under various operating systems. Expand with
     PkUnZip -d 640X_USR.403
     
     CMD's BBS at (714) 454-1134. File 640X_USR.403
     
Warp bypass for the early CMD-640 chip flaws. It has been superseded
by pj19409.zip. You no longer need to install it before pj19409.zip.

     ftp://ftp-os2.cdrom.com/pub/os2/drivers/cmd640x.zip
     

Essays
     
Roedy Green's FAQ (Frequently Asked Questions), an unabridged copy of
this article in both Winword and ASCII format. BY THE TIME YOU READ
THIS, I MAY HAVE POSTED A NEWER VERSION named eidete21.zip, eidete22.zip
etc.  You would not believe how many people reported being unable to
find the program simply because it had been replaced with a newer
version.

     ftp://garbo.uwasa.fi/pc/diskutil/eidete20.zip

PowerQuest essay:
     
     http://www.powerquest.com/
     
Intel's FAQ
     
     http://www.intel.com/procs/support/rz1000
     
PC-Tech's essay:
     
     http://www.mei.micron.com/rz1000/rz1000.txt
     
Catch Pat Duffy's (duffy@theory.chem.ubc.ca) essays each Sunday in:
     
comp.os.os2.misc, comp.os.os2.setup.misc, comp.os.os2.setup.storage
     and comp.sys.ibm.pc.hardware.misc
     
Check out Pat Duffy's Web site at:
     
     http://warp.eecs.berkeley.edu/os2/workbench/work.htm
and
     ftp://ftp.netcom.com/pub/ab/abe/


What If You Don't Have Internet Access?

If you send me $5 (US or Canadian) to cover duplication, postage to
anywhere in the world, and handling, I will send you a diskette
containing the relevant test programs, fixes, Internet postings and
essays. Sorry, but for various reasons I do not provide this package
via EMAIL. See the address below.


Contacting the Author

The author, Roedy Green, is a computer consultant who prefers to work
on Java, Forth, C++, Delphi, DOS, OS/2 and Internet Web projects.
(250) 285-2954.

I have been swamped with phone calls and Email from people who have
not yet read this essay. If you phone to ask a question already
covered in this essay I may be rather short with you. You may be only
asking for a few minutes of my time free, but it adds up, so that I
was not able to earn any money for four months.

Please report any machines with flaws. Send email to:

     Roedy@bix.com

or discuss this problem in the Internet newsgroup:

     comp.os.os2.bugs.

You can also write via snail mail:


Roedy Green
Canadian Mind Products
POB 707 Quathiaski Cove
Quadra Island, BC CANADA V0P 1N0
telephone (250) 285-2954
Internet Roedy@bix.com

-30-

