Data mining: a case study

StatSoft Pacific Pty Ltd
http://www.statsoft.com.au
By Tony Grichnik & Mike Seskin, Caterpillar; Thomas Hill, StatSoft
Wednesday, 08 November, 2006


Caterpillar Inc has reduced rotating machinery anomalies by nearly 45%, thanks to improvements delivered by data mining methods.

Like many large companies, Peoria, Illinois-based Caterpillar has developed sophisticated data-collection techniques. A single manufacturing plant can use thousands of gauges, sensors and other automated devices to collect data from machines and other manufacturing equipment. But much of the massive volume of data the company collected to monitor hundreds of process parameters went unused.

Beyond the data was the dilemma of what to do to improve quality. What knobs get turned and in which direction? What design feature gets modified and by how much? What exactly in the manufacturing stream gets adjusted, altered, tuned or tweaked to positively affect quality and compliance outcomes?

Using predictive data mining on their mountains of data, Caterpillar's quality managers uncovered the manageable process parameters most important to the quality outcomes of finished products. Tribal knowledge, which contributed to process variation, was now consolidated and focused by providing a line of sight between the manufacturing processes and the finished product. By distilling the relevant relationships from a large candidate pool of data, the manufacturing process was fine-tuned to consistently produce a finished product that met engineering and customer requirements without added costs.

Engineers at Caterpillar are now deploying data-mining software developed with StatSoft Inc of Tulsa, Oklahoma, that answers questions about enterprise-wide quality control and improvement. To date, the software has helped Caterpillar's manufacturing and design engineers use empirical data for quality improvements.

When the software was implemented on a Caterpillar rotating assembly found in industrial machine equipment, the results were improved product quality and streamlined manufacturing processes that produced cost savings in many categories. When engineers set out to solve an intermittent vibration problem found in some engines, they applied the define, measure, analyse, improve and control (DMAIC) methodology to the problem.

During the 'measure' phase, engineers examined what data were useful to consider in relation to the vibration problem, which drove unnecessary cost by interrupting the finished product testing to 'trim balance' the rotating assembly. Here, the abundance of data included 113 different assembly features measured during the manufacturing process. During the 'analyse' phase, Caterpillar distilled these raw data down to a subset of predictor variables, such as clearances and fits in the rotating assembly that affect trim balance in an engine.
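The distillation step described above can be illustrated with a minimal sketch. It assumes a simple correlation screen, which is not necessarily the method the Caterpillar software used, and it runs on synthetic data: the feature names, the six-feature pool (standing in for the 113 measured features) and the coefficients are all hypothetical.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def rank_features(features, outcome):
    """Sort features by absolute correlation with the outcome, strongest first."""
    scores = {name: abs(pearson(values, outcome))
              for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Synthetic stand-in for measured assembly features (hypothetical names).
rng = random.Random(0)
n = 500
features = {f"feature_{i}": [rng.gauss(0, 1) for _ in range(n)]
            for i in range(6)}

# Assumption: only feature_1 and feature_4 actually drive the imbalance.
imbalance = [2.0 * a + 1.5 * b + rng.gauss(0, 0.5)
             for a, b in zip(features["feature_1"], features["feature_4"])]

ranking = rank_features(features, imbalance)
top_two = [name for name, _ in ranking[:2]]
print(ranking)
```

A correlation screen only detects linear, one-at-a-time relationships. Interactions between features, such as the two interacting run-outs mentioned later in the case, would need a richer model.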

The data were categorised as input, output or constraint data, and the software then identified a subset of variables that drove trim balance outcomes during finished product testing. To validate its predictions, the software simulates the probability of trim balance problems. If the simulation results meet the minimum validation criteria for accuracy and uncertainty, the software then optimises the model to minimise trim balance.
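As a rough illustration of simulating the probability of a trim balance problem, the sketch below uses a plain Monte Carlo estimate. The linear imbalance model, its coefficients, the spreads and the limit are assumptions made for the example, not values from the Caterpillar study.

```python
import random
from math import sqrt

def simulate_problem_rate(sd_a, sd_b, limit=3.0, trials=20000, seed=1):
    """Estimate the share of assemblies whose imbalance exceeds the limit.

    Returns the estimated probability and its standard error.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        runout_a = rng.gauss(0, sd_a)   # hypothetical predictor 1
        runout_b = rng.gauss(0, sd_b)   # hypothetical predictor 2
        noise = rng.gauss(0, 0.5)       # unexplained variation
        # Assumed linear model linking the predictors to imbalance.
        imbalance = 2.0 * runout_a + 1.5 * runout_b + noise
        if abs(imbalance) > limit:
            failures += 1
    rate = failures / trials
    std_err = sqrt(rate * (1.0 - rate) / trials)
    return rate, std_err

rate, std_err = simulate_problem_rate(sd_a=1.0, sd_b=1.0)
print(f"estimated problem rate: {rate:.3f} +/- {std_err:.3f}")
```

The standard error gives one simple uncertainty measure that could be compared against a minimum criterion before the model is used for optimisation.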

Moving into the 'improve' phase, engineers used the information rendered in the software's Actionable Decision Environment to explore what-if alternatives. In Caterpillar's case, the model revealed six assembly features that could affect trim balance. Engineers used the environment to explore scenarios for implementing cost-effective changes and determined that making two of the changes would have a disproportionately high effect on reducing the frequency of trim balance problems.

The Actionable Decision Environment indicated what actions would produce the desired outcome, allowing the Caterpillar team to determine that a reduction in run-out of two interacting features on the assembly would reduce the occurrence of trim balance problems by approximately 50%.
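A what-if comparison of this kind can be sketched as follows. The example assumes six independent, normally distributed features that combine linearly, so the probability of exceeding the limit has a closed form. The feature names, sensitivities, spreads and limit are hypothetical and are not intended to reproduce the 50% figure above.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sensitivity of imbalance to each of six assembly features.
WEIGHTS = {"f1": 2.0, "f2": 1.5, "f3": 0.4, "f4": 0.3, "f5": 0.2, "f6": 0.1}
BASE_SD = {name: 1.0 for name in WEIGHTS}   # assumed current spread
LIMIT = 3.0                                 # assumed acceptance limit

def problem_probability(sd):
    """P(|imbalance| > LIMIT) for a zero-mean linear combination of normals."""
    total_sd = sqrt(sum((WEIGHTS[n] * sd[n]) ** 2 for n in WEIGHTS))
    return 2.0 * (1.0 - NormalDist(0.0, total_sd).cdf(LIMIT))

def what_if(tightened, factor=0.5):
    """Problem probability if the listed features have their spread scaled."""
    scenario = dict(BASE_SD)
    for name in tightened:
        scenario[name] = BASE_SD[name] * factor
    return problem_probability(scenario)

baseline = what_if([])
# Relative reduction in problems from tightening each feature on its own.
single = {name: 1.0 - what_if([name]) / baseline for name in WEIGHTS}
# Relative reduction from tightening the two most influential features together.
best_pair_reduction = 1.0 - what_if(["f1", "f2"]) / baseline

for name, reduction in single.items():
    print(f"{name}: {reduction:.1%}")
print(f"f1 + f2: {best_pair_reduction:.1%}")
```

Under these assumptions, tightening the low-sensitivity features changes the problem rate very little, which is the pattern that lets engineers concentrate on a small number of changes.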

The software's simultaneous optimisation module effectively addresses more complicated situations with competing requirements. For example, industrial machinery typically has power, fuel efficiency and emissions requirements. The software finds the optimum balance among competing goals by giving ranges of inputs relative to the desired outcomes.
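One common way to balance competing goals is a desirability-function approach, sketched below with a simple grid search. This is a stand-in technique chosen for illustration; the article does not state which method the software's optimisation module uses. The single input setting, the response models and the target ranges are all hypothetical.

```python
def desirability_higher(value, low, high):
    """0 at or below low, 1 at or above high, linear in between."""
    return min(1.0, max(0.0, (value - low) / (high - low)))

def desirability_lower(value, low, high):
    """1 at or below low, 0 at or above high, linear in between."""
    return min(1.0, max(0.0, (high - value) / (high - low)))

def responses(x):
    """Hypothetical response models for one input setting x in [0, 1]."""
    power = 100.0 + 50.0 * x
    fuel_use = 20.0 + 10.0 * x * x
    emissions = 5.0 + 4.0 * (x - 0.3) ** 2
    return power, fuel_use, emissions

def overall(x):
    """Geometric mean of the three individual desirabilities."""
    power, fuel_use, emissions = responses(x)
    d_power = desirability_higher(power, 100.0, 150.0)
    d_fuel = desirability_lower(fuel_use, 20.0, 30.0)
    d_emissions = desirability_lower(emissions, 5.0, 7.0)
    return (d_power * d_fuel * d_emissions) ** (1.0 / 3.0)

grid = [i / 100 for i in range(101)]
best_x = max(grid, key=overall)

# Range of settings that stay within 95% of the best overall desirability.
acceptable = [x for x in grid if overall(x) >= 0.95 * overall(best_x)]
input_range = (min(acceptable), max(acceptable))
print(best_x, input_range)
```

Because the geometric mean falls to zero when any single goal is completely unmet, the search cannot trade one requirement away entirely in favour of the others. Reporting a range rather than a single point mirrors the idea of giving ranges of inputs relative to the desired outcomes.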

The results of using purposeful data mining for this particular Caterpillar rotating assembly included a 45% reduction in trim balance problems. Besides the typical cost reductions associated with fewer finished-product discrepancies, such as reduced rework and scrap, and less visible effects such as increased throughput, the resources required to support testing also decreased. Additional return on investment (ROI) is derived from the fact that useful and actionable data were distilled. The investment in metrology equipment to collect data, storage and management applications for the data, and the added product cycle time to collect the data are just a few of the costs that are easily recovered when the data are converted to knowledge and useful actions.

Caterpillar reduces post-manufacturing product adjustments and effectively speeds time to market because it 'predicts' the manufacturing process parameters driving the need for expensive and time-consuming product adjustments. More ROI is achieved through cost savings from the reduction in personnel and time previously used to perform the adjustments.

Manufacturing-specific data mining

A breakthrough in Caterpillar's approach overcame a critical barrier in attempting to deploy data mining in manufacturing: making the recommendations from the specific predictive models relevant to practical implementation.

Quality process managers, manufacturing engineers, design engineers and other decision makers at Caterpillar manipulated process settings on virtual models to observe the effect on product quality outcomes. Interacting with the model helps participants see opportunities to add their collective expertise to the software predictions and derive reality-based settings for optimal quality results.

Included with the Actionable Decision Environment is a set of graphical tools to help users interact with the virtual process models. The interface is a mechanism that allows users to poke the model with what-if scenarios so that they can see the implications of their actions.

Using this approach, Caterpillar engineers were able to simultaneously improve multiple, competing outcomes (eg, power vs fuel efficiency) driven by multiple upstream processes.

At the same time and equally important, the software helped Caterpillar identify processes where additional variation could be allowed with no penalty in product quality. The upside of the findings is simple: Why spend money and time to control a material or process variable when it doesn't contribute to product quality?

Originally published in Quality Digest (www.qualitydigest.com), September 2006.


