1. Technical Field
The present application relates generally to an improved data processing system and method. More specifically, the present application is directed to a method for discovering and isolating failure of high speed traces in a manufacturing environment.
2. Description of Related Art
Over the past decade, a transition has taken place as to the preferred method for implementing high throughput data links. Traditionally, high speed interfaces over relatively short distances were implemented using wide parallel buses, such as peripheral component interconnect extended (PCI-X), which contains a 64-bit wide data bus. More recent implementations use high speed serial links, such as Fibre Channel or serial attached SCSI (SAS), which usually contain only two bidirectional differential high speed pairs. In order to get the same data throughput as the wide parallel buses over a serial interface, the speed at which the data is transferred is dramatically increased, with recent speeds for Fibre Channel reaching 8 Gb/s and SAS reaching 6 Gb/s. This increase in speed presents vastly different challenges for testing in a manufacturing environment as compared to the wide parallel buses, which may only run at 133 MHz, as an example.
The typical measurement for determining if a high speed differential serial interface is acceptable is bit error rate (BER). Allowable limits for BER may be one error in 10^12 data bits. Most system designs have margin designed into them that greatly surpasses the 1×10^-12 BER, which makes testing too long to be feasible for the manufacturing environment. Existing methods simply employ wrap back testing with attenuators to reduce the designed-in margin. The problem with this methodology is that it does not allow for component variation.
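To make the test-time problem concrete, the following minimal sketch estimates how long a link must run error-free before a given BER can be claimed. It assumes the commonly used zero-error confidence bound, n = -ln(1 - CL) / BER, which the text does not state; the 6 Gb/s line rate, the 95% confidence level, and the function names are illustrative assumptions.

```python
import math


def bits_required(target_ber: float, confidence: float = 0.95) -> float:
    """Bits that must pass with zero errors to claim BER <= target_ber
    at the given confidence level (zero-error bound, an assumption)."""
    return -math.log(1.0 - confidence) / target_ber


def seconds_to_verify(target_ber: float, line_rate_bps: float,
                      confidence: float = 0.95) -> float:
    """Time needed to transfer the required number of bits at line_rate_bps."""
    return bits_required(target_ber, confidence) / line_rate_bps


if __name__ == "__main__":
    # 1e-12 is the allowable limit from the text; 1e-15 stands in for a
    # design with extra margin (hypothetical value).
    for ber in (1e-12, 1e-15):
        t = seconds_to_verify(ber, line_rate_bps=6e9)
        print(f"BER {ber:g}: about {t:,.0f} s per link")
```

Under these assumptions the 10^-12 limit alone takes several minutes per link, and every additional decade of designed-in margin multiplies that time by ten.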
As an example, a typical serializer/de-serializer (SERDES) transmitter may be specified to have a maximum differential output of 1.0 V and a minimum of 600 mV. A typical SERDES receiver may be specified to have a minimum input amplitude of 200 mV. With these example numbers, an attenuator may be set to have a 12 dB attenuation, which would reduce the transmitter amplitude by a factor of roughly three. For a “worst case” transmitter of 600 mV, the signal would be reduced to 200 mV, so that any manufacturing defects can easily be discovered and isolated: even small defects in trace, solder quality, or components, such as blocking capacitors, will reduce the signal below the minimum value. However, a more typical or even a “best case” transmitter may still have enough margin on the signal that manufacturing defects are not easily spotted, leading to latent field failures.
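The margin arithmetic in this example can be checked with a short sketch. It treats the attenuation as a voltage ratio, 10^(dB/20), which is an assumption about how the attenuator is specified. Under that convention 12 dB is a factor of about 4, and a factor of three corresponds to about 9.5 dB, so both values are shown. The function names are hypothetical.

```python
def attenuation_ratio(db: float) -> float:
    """Voltage (amplitude) ratio for an attenuation given in dB."""
    return 10 ** (db / 20.0)


def attenuated_mv(amplitude_mv: float, db: float) -> float:
    """Amplitude in mV after passing through an attenuator of `db` dB."""
    return amplitude_mv / attenuation_ratio(db)


RX_MIN_MV = 200.0  # example receiver minimum input amplitude from the text

if __name__ == "__main__":
    # 600 mV and 1.0 V are the example transmitter limits from the text.
    for tx_mv in (600.0, 1000.0):
        for db in (9.5, 12.0):
            rx_mv = attenuated_mv(tx_mv, db)
            print(f"tx {tx_mv:.0f} mV, {db} dB -> {rx_mv:.0f} mV "
                  f"(margin {rx_mv - RX_MIN_MV:+.0f} mV)")
```

Either way, the output illustrates the point of the paragraph: a fixed attenuator that leaves a minimum-amplitude transmitter with no margin still leaves a maximum-amplitude transmitter with considerable margin.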
As a specific example, a card may pass manufacturing tests with a solder short on a connector that grounds one half of a differential pair, essentially reducing the output amplitude by a factor of two. There may still be enough margin in the transmitter that, even after running in single ended mode and being attenuated by 12 dB in the manufacturing environment, no error is detected. At some time much later in the life of the card, the short may cause a failure in a customer environment.
The illustrative embodiments recognize the disadvantages of the prior art and provide a mechanism for discovering and isolating failure of high speed traces in a manufacturing environment. The mechanism utilizes transmit pre-emphasis and receiver equalization in combination with attenuated wrap plugs to enhance discovery and isolation of manufacturing defects in the manufacturing environment. The mechanism adjusts pre-emphasis and equalization in real time in high speed devices, allowing for much greater variation to compensate for design margins and specification variances. While the card is under test with wrap-backs installed, the pre-emphasis and receiver equalization are brought to the limits while logging the bit error rate to a non-volatile memory element. The mechanism then compares the bit error rate information to empirically derived signatures for failure isolation.
In one illustrative embodiment, a computer program product comprises a computer useable medium having a computer readable program. The computer readable program, when executed on a computing device, causes the computing device to create one or more signatures for devices with known hard error injects. Each signature within the one or more signatures comprises combinations of settings and error rate information for each combination of settings. The computer readable program further causes the computing device to vary settings on a device under test to test the device under test with a plurality of combinations of settings, monitor error rate for the device under test, log each combination of settings with corresponding error rate information, compare the logged combinations of settings and error rate information with the one or more signatures, and identify a faulty component or circuit based on the comparison.
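One way to picture a signature as described here is a table keyed by combination of settings, with the error rate observed at each combination. The sketch below is only an illustration: the class name, field names, and the 10^-12 pass/fail limit are assumptions, and the text does not prescribe any particular representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (transmit pre-emphasis, receiver equalization); integer codes are assumed.
Setting = Tuple[int, int]


@dataclass
class Signature:
    """Error rate recorded for each combination of settings for one fault."""
    fault: str
    ber_by_setting: Dict[Setting, float] = field(default_factory=dict)

    def log(self, pre_emphasis: int, equalization: int, ber: float) -> None:
        """Record the error rate observed at one combination of settings."""
        self.ber_by_setting[(pre_emphasis, equalization)] = ber

    def failing_settings(self, ber_limit: float = 1e-12) -> List[Setting]:
        """Combinations whose recorded error rate exceeds the limit."""
        return sorted(s for s, ber in self.ber_by_setting.items()
                      if ber > ber_limit)


if __name__ == "__main__":
    sig = Signature("cold solder joint")
    sig.log(13, 3, 1e-6)  # made-up value for illustration
    sig.log(0, 0, 0.0)
    print(sig.failing_settings())
```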
In one exemplary embodiment, creating one or more signatures for devices with known hard error injects comprises varying settings on a given device to test the given device with a plurality of combinations of settings. The given device has a given hard error injected therein. Creating one or more signatures further comprises monitoring error rate for the given device, logging each combination of settings with corresponding error rate information for the given device, and storing the combinations of settings and corresponding error rate information as a signature for the given hard error.
In a further exemplary embodiment, the settings on the given device comprise transmit pre-emphasis. In a still further exemplary embodiment, the settings on the given device comprise receiver equalization. In another exemplary embodiment, the error rate information comprises a measured bit error rate.
In one exemplary embodiment, the settings on the device under test comprise transmit pre-emphasis. In another exemplary embodiment, the settings on the device under test comprise receiver equalization. In a still further exemplary embodiment, the error rate information comprises a measured bit error rate.
In another illustrative embodiment, a data processing system comprises a processor and a memory coupled to the processor. The memory contains instructions which, when executed by the processor, cause the processor to create one or more signatures for devices with known hard error injects. Each signature within the one or more signatures comprises combinations of settings and error rate information for each combination of settings. The instructions further cause the processor to vary settings on a device under test to test the device under test with a plurality of combinations of settings, monitor error rate for the device under test, log each combination of settings with corresponding error rate information, compare the logged combinations of settings and error rate information with the one or more signatures, and identify a faulty component or circuit based on the comparison.
In one exemplary embodiment, creating one or more signatures for devices with known hard error injects comprises varying settings on a given device to test the given device with a plurality of combinations of settings. The given device has a given hard error injected therein. Creating one or more signatures further comprises monitoring error rate for the given device, logging each combination of settings with corresponding error rate information for the given device, and storing the combinations of settings and corresponding error rate information as a signature for the given hard error.
In a further exemplary embodiment, the settings on the given device comprise transmit pre-emphasis. In a still further exemplary embodiment, the settings on the given device comprise receiver equalization.
In another exemplary embodiment, the settings on the device under test comprise transmit pre-emphasis. In yet another exemplary embodiment, the settings on the device under test comprise receiver equalization. In still another exemplary embodiment, the error rate information comprises a measured bit error rate.
In a further illustrative embodiment, a method for detecting and isolating a failure in a high speed device comprises creating one or more signatures for devices with known hard error injects. Each signature within the one or more signatures comprises combinations of settings and error rate information for each combination of settings. The method further comprises varying settings on a device under test to test the device under test with a plurality of combinations of settings, monitoring error rate for the device under test, logging each combination of settings with corresponding error rate information, comparing the logged combinations of settings and error rate information with the one or more signatures, and identifying a faulty component or circuit based on the comparison.
In one exemplary embodiment, creating one or more signatures for devices with known hard error injects comprises injecting a given hard error into a given device, varying settings on the given device to test the given device with a plurality of combinations of settings, monitoring error rate for the given device, logging each combination of settings with corresponding error rate information for the given device, and storing the combinations of settings and corresponding error rate information as a signature for the given hard error.
In another exemplary embodiment, the settings comprise transmit pre-emphasis. In yet another exemplary embodiment, the settings comprise receiver equalization. In still another exemplary embodiment, the error rate information comprises a measured bit error rate.
These and other features and advantages of the present invention will be described in, or will become apparent to those of ordinary skill in the art in view of, the following detailed description of the exemplary embodiments of the present invention.
The invention, as well as a preferred mode of use and further objectives and advantages thereof, will best be understood by reference to the following detailed description of illustrative embodiments when read in conjunction with the accompanying drawings, wherein:
Referring to the figures,
With reference now to
Turning to
With reference now to
Turning to
Those of ordinary skill in the art will appreciate that the hardware depicted in
In accordance with an illustrative embodiment, various hard errors may be injected into a card, such as a switch module.
Test fixture 310 may include a processor, P, and a memory, M, for executing the test. Test fixture 310 may load instructions into the memory, M, for execution on the processor, P. These instructions may control the test of the devices with the hard errors injected therein and the devices under test. Furthermore, during monitoring, settings and the BER information may be stored in the memory, M.
For instance, a common manufacturing problem with high speed interfaces is cold solder joints on the blocking capacitors that sit inline on the high speed traces. For this first step, a cold solder joint is created on one of the capacitors, and the card is plugged into the test fixture. The different combinations of pre-emphasis and equalization are tested, and a signature for this type of failure is recorded. The results may show that a pre-emphasis of 12-14 combined with a receiver equalization of 2-5 is the typical combination that catches cold solder joints, because higher pre-emphasis creates faster edge rates, which should pull out capacitor problems.
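A recorded sweep can be summarized as the range of settings at which the injected fault shows up. The sketch below is a rough illustration of that summary step: the setting ranges (0-15 for pre-emphasis, 0-7 for equalization), the error rates, and the function names are all made up, and only the 12-14 and 2-5 window is taken from the example in the text.

```python
from typing import Dict, Optional, Tuple

Setting = Tuple[int, int]  # (pre-emphasis, equalization)
Window = Tuple[Tuple[int, int], Tuple[int, int]]


def failing_window(log: Dict[Setting, float],
                   ber_limit: float = 1e-12) -> Optional[Window]:
    """Bounding ranges of the settings whose error rate exceeds ber_limit.

    Returns ((min_pe, max_pe), (min_eq, max_eq)), or None if nothing failed.
    """
    failing = [s for s, ber in log.items() if ber > ber_limit]
    if not failing:
        return None
    pes = [pe for pe, _ in failing]
    eqs = [eq for _, eq in failing]
    return (min(pes), max(pes)), (min(eqs), max(eqs))


def synthetic_log() -> Dict[Setting, float]:
    """Made-up sweep: errors appear only for pre-emphasis 12-14, equalization 2-5."""
    return {(pe, eq): (1e-6 if 12 <= pe <= 14 and 2 <= eq <= 5 else 0.0)
            for pe in range(16) for eq in range(8)}


if __name__ == "__main__":
    print(failing_window(synthetic_log()))
```

A bounding window discards the shape of the failing region, so a real signature would more likely keep the full table of settings and error rates.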
Another example of an error inject may be a printed circuit board (PCB) defect that causes cross talk on high speed networks. Again, the error is injected, and combinations of pre-emphasis and equalization are tested. The mechanism stores a signature for this failure. The results should be that one transmit pair with maximum pre-emphasis, which creates the most cross talk potential, and a different receiver pair with maximum equalization, which reduces signal to noise ratio, catch the failure. Other error injects may be solder shorts and trace imperfections, for example. The data logged in this step are used as signatures for detection and isolation in a real testing process.
The next step is to test actual devices to discover and isolate failures in high speed traces in the manufacturing environment.
Test fixture 410 may include a processor, P, and a memory, M, for executing the test. Test fixture 410 may load instructions into the memory, M, for execution on the processor, P. These instructions may control the test of the devices with the hard errors injected therein and the devices under test. Furthermore, during monitoring, settings and the BER information may be stored in the memory, M.
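The settings and BER information gathered during monitoring need to survive the test run if they are to be compared against signatures later. The text does not specify a storage format, so the sketch below is only a stand-in: it appends one JSON record per combination of settings to a file and forces each record to disk. The record fields and function names are assumptions.

```python
import json
import os
from typing import Dict, List


def append_record(path: str, pre_emphasis: int, equalization: int,
                  bits: int, errors: int) -> None:
    """Append one sweep result to a JSON-lines log and flush it to storage."""
    record = {
        "pre_emphasis": pre_emphasis,
        "equalization": equalization,
        "bits": bits,
        "errors": errors,
        "ber": errors / bits if bits else None,  # undefined if nothing was sent
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # make the record durable before continuing


def read_records(path: str) -> List[Dict]:
    """Load every record previously written by append_record."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Writing one record per combination, rather than one file at the end of the sweep, keeps the partial log usable if the test is interrupted.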
Accordingly, blocks of the flowchart illustrations support combinations of means for performing the specified functions, combinations of steps for performing the specified functions and program instruction means for performing the specified functions. It will also be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by special purpose hardware-based computer systems which perform the specified functions or steps, or by combinations of special purpose hardware and computer instructions.
Furthermore, the flowcharts are provided to demonstrate the operations performed within the illustrative embodiments. The flowcharts are not meant to state or imply limitations with regard to the specific operations or, more particularly, the order of the operations. The operations of the flowcharts may be modified to suit a particular implementation without departing from the spirit and scope of the present invention.
With reference now to
Then, the tester determines whether more combinations of pre-emphasis and receiver equalization settings remain to be tested (block 510). If more combinations remain, operation returns to block 504 to vary pre-emphasis and equalization settings. If no more combinations of pre-emphasis and equalization settings remain to be tested in block 510, the tester stores the settings and error rate information as a signature for the hard error (block 512).
Thereafter, the tester determines whether more hard error types are to be tested (block 514). If more hard error types remain to be tested, operation returns to block 502 where the tester injects a hard error into a device and operation repeats for the new hard error. If there are no more hard error types to test in block 514, operation ends.
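The signature-creation flow described above amounts to two nested loops: an outer loop over hard error types and an inner loop over combinations of settings. The following sketch is one possible rendering under stated assumptions. The `measure_ber` callback is a hypothetical stand-in for injecting the fault, applying the settings, and monitoring the link, and the block numbers in the comments are only those named in the text.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Tuple

Setting = Tuple[int, int]  # (pre-emphasis, equalization)
MeasureBer = Callable[[str, int, int], float]


def build_signatures(fault_types: Iterable[str],
                     pre_emphasis_levels: Iterable[int],
                     equalization_levels: Iterable[int],
                     measure_ber: MeasureBer) -> Dict[str, Dict[Setting, float]]:
    """Sweep every combination of settings for every injected fault type."""
    pe_levels = list(pre_emphasis_levels)
    eq_levels = list(equalization_levels)
    signatures: Dict[str, Dict[Setting, float]] = {}
    for fault in fault_types:                      # blocks 502 and 514
        log: Dict[Setting, float] = {}
        for pe, eq in product(pe_levels, eq_levels):   # blocks 504 and 510
            log[(pe, eq)] = measure_ber(fault, pe, eq)  # monitor and log
        signatures[fault] = log                    # block 512
    return signatures


if __name__ == "__main__":
    # Made-up behavior: only the "cold solder" fault fails, at high pre-emphasis.
    def fake_measure(fault: str, pe: int, eq: int) -> float:
        return 1e-6 if fault == "cold solder" and pe >= 12 else 0.0

    sigs = build_signatures(["cold solder", "crosstalk"],
                            range(16), range(8), fake_measure)
    print(len(sigs["cold solder"]), "combinations logged per fault")
```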
Then, the tester determines whether more combinations of pre-emphasis and receiver equalization settings remain to be tested (block 608). If more combinations remain, operation returns to block 602 to vary pre-emphasis and equalization settings. If no more combinations of pre-emphasis and equalization settings remain to be tested in block 608, the tester compares the settings and error rate information with signatures for known failures (block 610). If the settings and error rate information reasonably match a signature for a known failure, the tester identifies the faulty component or circuit within the device under test (block 612). Thereafter, operation ends.
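The text leaves open what "reasonably match" means. As one possible interpretation, the sketch below scores a device-under-test log against each signature by the fraction of shared settings that agree on pass or fail, and reports the best-scoring fault if the score clears a threshold. The metric, the 0.9 threshold, the 10^-12 limit, and all names are assumptions, not the method defined by the source.

```python
from typing import Dict, Optional, Tuple

Setting = Tuple[int, int]  # (pre-emphasis, equalization)
Log = Dict[Setting, float]


def agreement(dut_log: Log, signature: Log, ber_limit: float = 1e-12) -> float:
    """Fraction of shared settings where DUT and signature agree on pass/fail."""
    shared = dut_log.keys() & signature.keys()
    if not shared:
        return 0.0
    same = sum((dut_log[s] > ber_limit) == (signature[s] > ber_limit)
               for s in shared)
    return same / len(shared)


def identify_fault(dut_log: Log, signatures: Dict[str, Log],
                   min_agreement: float = 0.9,
                   ber_limit: float = 1e-12) -> Optional[str]:
    """Name of the best-matching known fault, or None if nothing matches."""
    if all(ber <= ber_limit for ber in dut_log.values()):
        return None  # no errors at any setting: nothing to isolate
    best_fault, best_score = None, 0.0
    for fault, sig in signatures.items():          # block 610
        score = agreement(dut_log, sig, ber_limit)
        if score > best_score:
            best_fault, best_score = fault, score
    return best_fault if best_score >= min_agreement else None  # block 612


if __name__ == "__main__":
    known = {
        "cold solder": {(13, 3): 1e-6, (0, 0): 0.0},  # made-up signatures
        "crosstalk": {(13, 3): 0.0, (0, 0): 1e-6},
    }
    print(identify_fault({(13, 3): 5e-7, (0, 0): 0.0}, known))
```

Comparing pass/fail patterns rather than raw error rates is a deliberate simplification here; a production tester might instead compare error rates on a logarithmic scale.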
Thus, the illustrative embodiments address the disadvantages of the prior art by providing a mechanism for discovering and isolating failure of high speed traces in a manufacturing environment. The mechanism utilizes transmit pre-emphasis and receiver equalization in combination with attenuated wrap plugs to enhance discovery and isolation of manufacturing defects in the manufacturing environment. The mechanism adjusts pre-emphasis and equalization in real time in high speed devices, allowing for much greater variation to compensate for design margins and specification variances. While the card is under test with wrap-backs installed, the pre-emphasis and receiver equalization are brought to their limits while the bit error rate is logged to a non-volatile memory element. The mechanism then compares the bit error rate information to empirically derived signatures for failure isolation.
It should be appreciated that the illustrative embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In one exemplary embodiment, the mechanisms of the illustrative embodiments are implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
Furthermore, the illustrative embodiments may take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The medium may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read-only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.