Can Predictive Analytics Help Reduce Workplace Risk?

Posted by Gregory R. Wagner, M.D.

“Prediction is very difficult, especially if it’s about the future.” — Niels Bohr



  • Text message to chemical plant manager: Chlorine leak expected on line 2 tomorrow. Inspect and repair.
  • High priority email and automatic call to coal mine superintendent: 83% chance of roof fall on section 4. Evacuate immediately and take corrective actions.
  • Monthly notice to OSHA regional administrator: HIGH PRIORITY INSPECTION ROSTER: Firms listed below have a greater than 80% probability of violations reflecting hazardous conditions requiring mitigation.

Science fiction or the next logical step in workplace prevention?

Predictive analytics is the application of recently developed analytic techniques to build, test, refine, and apply algorithms in an effort to, as Eric Siegel famously wrote, “predict who will click, buy, lie, or die,” and it is a burgeoning field [Siegel 2013]. The availability of “big data” [Mayer-Schönberger 2013] (vast quantities of information characterized by volume, velocity, and variety) and the computing power to process that information rapidly have opened up innumerable possibilities for both private sector and public benefit.

Predictive analytics [PA] is at work each time Amazon informs a customer they may be interested in a particular product; each time Netflix recommends a movie; when life insurance companies risk-stratify and set rates for potential policy holders; and when some credit card companies target potential customers with specific incentives to acquire their card. Harrah’s casino has been using predictive analytics to promote customer loyalty. UPS is said to employ predictive analytics to analyze sensor data to target fleet maintenance more economically and to adjust delivery routes to reduce traffic-related delays and fuel consumption.* Software that predicts whether airline ticket prices will rise or fall in the next week can guide online purchasing decisions.

The public sector is also benefiting from predictive analytics [Davenport 2008]. Analytics are being employed by an increasing number of police departments, including those in Los Angeles and Santa Clara, CA, to make specific patrol assignments in order to improve crime prevention. While this raises the specter of the anticipatory (and occasionally misdirected) “murder prevention” arrests portrayed in the 2002 movie Minority Report, concerns about the current approaches to “predictive policing” have been limited.

Health care, with the exploration and adoption of “personalized medicine,” is employing PA to tailor health screening and to match treatments to the individuals in whom they are most likely to succeed. And some health insurers are using PA to identify potential high-need/high-use beneficiaries in order to provide more intensive home-based services.

The political process is diving in as well. Electoral strategists are mining data to focus and reduce the cost of their “get out the vote” efforts in order to concentrate on people most likely to support their candidate.

Safety and Health Applications

NIOSH has been exploring the potential application of predictive analytics and related approaches to reducing risk of death, injury, and disease from work. Clearly, there is tremendous potential for improved prevention if accurate predictions of injury and disease probability are possible. It seems likely that if injuries can be predicted accurately, they can be prevented. Work environments can be modified, maintenance performed, workers trained, and improved protections offered. Multi-site employers could focus their efforts on sites where injury or death is most probable. It is also likely that the agencies that inspect workplaces and enforce protective regulations, OSHA and MSHA, would be more effective and efficient if they could direct their efforts to workplaces where the risk of injury or death is high.

While traditional epidemiology searches for the determinants of disease and injury over time in populations, PA focuses on the prediction of events or effects in individuals or other affected “units” (such as particular production lines or workplaces) during a specific time window. Epidemiology is used to establish exposure-response relationships using careful measurements of both exposures and health effects while controlling for population variability. PA often uses available historical data reflecting the endpoint of interest, for example a five-year history of mine injuries reported to MSHA, and divides the data into a training set and a test set. Using “machine learning” approaches, an algorithm is developed to fit the training set, drawing on a wide range of available, potentially relevant data. That algorithm is then applied to the test set to assess how well it predicts results it has not seen, and it is further refined if necessary. When a good algorithm is developed, it can be applied as new data are gathered. In this example, an algorithm that identifies mines where serious injuries are likely to occur could stimulate operators to adopt preventive practices and also help direct mine inspectors.
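The train-and-test workflow described above can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions: the records are synthetic (not MSHA data), the two predictors (`prior_injuries`, `months_since_inspection`) and their assumed link to injury risk are invented for the example, and the “algorithm” is a one-rule threshold model chosen for brevity rather than any method NIOSH or MSHA is known to use.

```python
import random


def make_synthetic_mines(n, seed=0):
    """Generate hypothetical mine records; these are NOT real MSHA data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        prior_injuries = rng.randint(0, 10)           # injuries in the history window
        months_since_inspection = rng.randint(0, 24)
        # Assumed relationship, for illustration only: risk rises with both features.
        risk = 0.05 + 0.06 * prior_injuries + 0.01 * months_since_inspection
        label = 1 if rng.random() < risk else 0       # 1 = serious injury occurred
        records.append(((prior_injuries, months_since_inspection), label))
    return records


def split(records, test_fraction=0.3, seed=1):
    """Shuffle, then hold out a test set that is never used during fitting."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]       # (training set, test set)


def accuracy(rule, records):
    """Fraction of records where 'feature >= threshold' matches the label."""
    feature, threshold = rule
    correct = 0
    for features, label in records:
        predicted = 1 if features[feature] >= threshold else 0
        correct += predicted == label
    return correct / len(records)


def fit_rule(training_set):
    """Pick the (feature, threshold) rule with the best training-set accuracy."""
    candidates = []
    for feature in range(2):
        for threshold in sorted({features[feature] for features, _ in training_set}):
            candidates.append((feature, threshold))
    return max(candidates, key=lambda rule: accuracy(rule, training_set))


records = make_synthetic_mines(1000)
training_set, test_set = split(records)
rule = fit_rule(training_set)                         # fit on the training set only
train_acc = accuracy(rule, training_set)
test_acc = accuracy(rule, test_set)                   # assess on held-out data
print(f"rule: feature {rule[0]} >= {rule[1]}")
print(f"training accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Comparing the two accuracy figures is the point of the exercise: a rule that scores well on the training set but poorly on the held-out test set has fit noise rather than a pattern that will carry over to new data.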

The data potentially relevant to predicting injury or disaster are plentiful: prior safety experience; a worker’s age and time in a specific job; time during a shift and hours worked during the prior day, week, or month; geographic location; how recently a workplace inspection occurred, and what the results were; season; enterprise profitability; the presence of an injury prevention program; union representation of the workforce; and on and on.

Barriers to Implementing PA in the Workplace

But there are many barriers to employing predictive analytics to improve occupational safety and health. Knowledge, skills, and attitudes of employers or workers may stand in the way. Some large workplaces have large quantities of relevant data available but do not know what to do with it. Others may have the potential to analyze and react to their data, but they may lack the motivation to apply analytics beyond sales and marketing. Also, small and medium-sized enterprises may not have either the data relevant to their specific worksite or the technical resources to utilize what they do have. There may be privacy concerns that limit access to relevant data, including concerns that the data will be used for other, less socially responsible purposes. And some employers may believe that they are doing just fine without the benefits of prediction and that the potential benefit isn’t worth the effort.

Prediction depends on the availability of information of adequate quality and consistency; access to trained analysts who have the tools to do their work; and an ability to frame questions and identify situations that are likely to benefit from prediction. That said, many of the thorniest problems faced in occupational safety and health may challenge the most sophisticated predictive analytic approaches. Low-probability, high-impact events such as oil rig failures and airplane crashes may be so rare as to defy prediction. But it is likely that other workplace problems, from long-haul truck crashes to chemical line leaks, could be anticipated and prevented by employing existing sensor technologies and predictive analytics.
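One reason, among others, that very rare events resist prediction is simple arithmetic: when the event almost never happens, even a fairly accurate alarm is wrong most of the times it sounds. The sketch below applies Bayes’ rule to show this. The base rates, sensitivity, and specificity are hypothetical numbers chosen only for illustration, and the function name is an assumption of this example.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Share of alarms that correspond to a real event, by Bayes' rule."""
    true_alarms = base_rate * sensitivity
    false_alarms = (1 - base_rate) * (1 - specificity)
    return true_alarms / (true_alarms + false_alarms)


# Hypothetical inputs: the same predictor quality applied to a common
# problem (1 in 10 units affected) and a rare one (1 in 10,000).
common = positive_predictive_value(base_rate=0.10, sensitivity=0.90, specificity=0.90)
rare = positive_predictive_value(base_rate=0.0001, sensitivity=0.90, specificity=0.90)

print(f"common event: {common:.1%} of alarms are real")
print(f"rare event:   {rare:.2%} of alarms are real")
```

With these assumed inputs, half of the alarms for the common problem are real, while fewer than one in a thousand are real for the rare one.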

What’s Next?

NIOSH is aware of at least one consulting firm that is marketing prediction to reduce workplace injury; others who are working to identify “leading indicators” that can be measured and are associated with good OSH performance; and a handful of others who are employing PA in private industry and academia with an OHS/prevention focus. We would appreciate hearing from you: (1) What circumstances could benefit from the application of predictive analytics? (2) Do you know of situations where predictive analytics is being used to improve workplace health and safety? (3) What are the barriers to adoption of advanced analytics for OHS? (4) What can be done to overcome these barriers?

We hope to hear from many of you and will look forward to sharing what we learn.


Gregory R. Wagner, M.D.

Dr. Wagner is Senior Advisor to the NIOSH Director 


*References to products or services do not constitute an endorsement by NIOSH or the U.S. government.

Sources & Resources

Ayres I [2007]. Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart. Bantam Books, New York, NY.

Davenport TH & Harris JG [2007]. Competing on Analytics: The New Science of Winning. Harvard Business School Press, Boston, MA.

Davenport TH & Jarvenpaa SL [2008]. Strategic Use of Analytics in Government. IBM Center for the Business of Government, Washington, DC.

Mayer-Schönberger V & Cukier K [2013]. Big Data: A Revolution That Will Transform How We Live, Work, and Think. Houghton Mifflin Harcourt Publishing Co, New York, NY.

Siegel E [2013]. Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die. John Wiley & Sons, Inc., Hoboken, NJ.

Silver N [2012]. The Signal and the Noise: Why So Many Predictions Fail—But Some Don’t. The Penguin Press, New York, NY.


29 comments on “Can Predictive Analytics Help Reduce Workplace Risk?”

Comments listed below are posted by individuals not associated with CDC, unless otherwise stated. These comments do not represent the official views of CDC, and CDC does not guarantee that any information posted by individuals on this site is correct, and disclaims any liability for any loss or damage resulting from reliance on any such information.

    Hi all:

    I think the starting point (Reduce Risk) is not valid. I think the correct starting point is “Simplify risk management”… and then minimize the consequences…

    Thanks for the comment. Ultimately, the goals of occupational safety and health interventions are to prevent disease, injury, and death from workplace conditions and to improve the health and wellbeing of the workforce. Better risk management may involve absolute risk reduction by identifying working conditions or materials that need to be redesigned or where less hazardous materials can be substituted. (This approach is central to the NIOSH Prevention Through Design initiative.) The question I tried to raise in the posting is whether Predictive Analytics [PA] might be useful in this process and whether readers have examples of the application of PA to problems faced by OHS professionals.

    It is not possible to predict everything; only a few things are predictable by analysing the situation. For example, building roof problems are not predictable; roofs are checked by experts for leaks. We are an Orlando roof repair and replacement company in Florida providing roof repair and replacement services.

    You make a good point: Predictive analytics [PA] can be used to help identify equipment, vehicles, or processes that are at higher risk of failure, but confirming that there is a problem often requires further assessment by an expert. It is possible, for example, that a property manager with many buildings to oversee may be able to use PA to identify which of the buildings is at highest risk of roof failure or leakage based on age, roofing materials, weather conditions, etc., and deploy human experts to evaluate the roofs more efficiently. PA can sort out levels of risk, but prediction is not expected to be absolutely accurate.

    Many PA tools exist already in the hospital and healthcare insurance industry and are used to great success to predict health service use and to anticipate claims based on a number of demographic factors (e.g. age, sex, gender) as well as a patient’s co-morbidities.

    It is somewhat surprising that such analytic tools do not yet exist in the Occupational Safety & Health sector though perhaps this is changing as OSH becomes more orientated towards total health and wellness.

    In my opinion, PA would be a welcome addition to the OSH field and I would be interested to hear about existing tools or places where PA is being used successfully.

    I have been using a formula to demonstrate how a preventive ergonomics process minimizes the filing of WRMSD workers’ compensation claims over time, and I then develop an ROI from the formula. I use Pareto analysis to determine the number of preventive cases that likely would have become a medical-only WC case or an indemnity case if not for the preventive ergonomics process. While it is general and may slightly over-estimate, when compared to real-time cases, I see those break out at 70%/30% or so. I have had the formula reviewed by academia and HFES peers over the years with no further feedback to change. Recently, I received some feedback that my formula is too general and over-estimates… but as you say, predictive analytics is challenging and I think my point is well taken when presented: prevention pays! I’d love to get your feedback, Dr. Wagner.

    In your article, you stated:

    We would appreciate hearing from you: (1) What circumstances could benefit from the application of predictive analytics? (2) Do you know of situations where predictive analytics is being used to improve workplace health and safety? (3) What are the barriers to adoption of advanced analytics for OHS? (4) What can be done to overcome these barriers?

    1. As identified in your article, leading indicators for safety and health events are being developed. A limitation I have practically found comes from Dr. Deming’s writings (Out of the Crisis, page 479) “Engineers often predict accidents. Their predictions are uncanny for correctness in detail. They fail in only one way –they can not predict exactly when the accident will happen.” We can observe behaviors and conditions and use other leading indicators to correlate to the likelihood of certain events occurring, but there is still a random feature that some of the “Big Data” folks ignore. For example, if I run a red light, it is not guaranteed that the next time I do that I will get in an accident. But if I habitually run red lights, I likely will end up in an accident eventually.

    A System of Profound Knowledge: SPC Trending Primer

    2. Yes, we are making use of leading indicators at the Savannah River Site of the US Department of Energy on a routine basis. It does help to keep our injury rates low. And we do work to find our changing conditions, where risk is changing. Unfortunately, if you successfully predict that we need to do some work here to prevent an event, and lo and behold you don’t have the event, how do you know that the prediction was good or not?
    Choosing Performance Indicators: SPC Trending Primer
    Control-Chart Dashboards

    3. Barriers are the current state of MBA degree programs for managers. I know since I’ve taught those courses, and read the (intended) curricula. Managers are trained to set targets, use moving averages, and otherwise “Kick Ass and Take Names”. They do not believe there is a random element that needs to be understood. They believe only that statisticians make life difficult.
    Injury Goal Setting

    4. Education. Also, Dr. Deming pointed out that statistical knowledge is very rare. We waste that talent in the United States more often than not. One shortcoming of the “Big Data” people is that they assume they have 100% of the “population” of data and so can ignore random effects. That may be good enough for sending me Kroger coupons. However, not even the most sophisticated computer has 100% of the population, as the population must also include the future, which is unknown, but indeed predictable. Statistical Process Control techniques offer a good compromise between some of the black box analytical tools and providing a message understandable to decision makers.
    Statistical Process Control Information: SPC Trending Primer

    I hope you have found some of these materials interesting and appreciate the opportunity to offer an opinion on your questions.

    Thanks much for your response and the materials you attached.

    I agree that leading indicators and the use of any information that comes from the application of predictive analytics must take randomness into account. Nate Silver’s book The Signal and the Noise [Silver 2012] explores and explicates this point. In my view, PA and the application of leading indicators may be of use in risk stratification and focusing preventive action, but their true utility in reducing the probability of low-frequency events (e.g., plane crashes, offshore oil rig explosions, etc.) is far from clear. It is a dilemma of public health that injuries, diseases, or disasters successfully prevented cannot be counted. Nonetheless, trends over time may be of use in estimating the effects of some interventions, if randomness is taken into account.

    I particularly appreciate the profound influence Dr. Deming has had on your thinking and approaches to workplace safety. Years ago I attended one of the last public training seminars Dr. Deming led. It was an incredible experience that has had a continuing influence on my own work.

    Very thought-provoking discussion. I think that somehow the ability to tie PA into leading indicators could be the magic bullet OHS professionals are looking for. Leading indicators promote the right actions toward incident prevention, but I am not aware of a robust study that has actually correlated leading indicators with desired outcomes, that is, low incident/injury rates. Have you thought of having some dialog with businesses like the SAS Institute in Cary, NC, that are leaders in providing PA software solutions?

    Nice post Gregory and I definitely think that prediction in this area is possible to some extent.

    My experience in this area has shown me that there are some very telling indicators of who may be more at risk of workplace injuries than others and they might not be indicators that you’d expect.

    For example:

    1) Workers that always show up on time for work are less likely to sustain workplace injuries.

    We put that down to the fact that these people are always “trying to do the right thing”, meaning they’re more likely to follow safety procedures and use safety equipment as it was intended to be used.

    Due to that, we recommend employers and safety officers pay extra attention to those that don’t keep to the correct start, end and break times or otherwise show less adherence to company policies.

    2) Workers who have been injured are more likely to sustain additional injuries.

    We initially believed this was predominantly because injured parts of the body were more prone to re-injury, but further investigation showed a closer correlation with employee attitudes: some staff give safety procedures lip service at best and take shortcuts as and when they’re convenient.

    3) We found that men were more prone to injury than women.

    Again, this appeared to be due to a lack of adherence to policies and recommendations (for example, men were more likely to try and lift unsafe weights than women).

    4) We found that staff who took more sick leave for general ailments like cold and flu were more likely to report injuries.

    We put that down to false reporting of injuries to gain time off work.

    Anyway, it was a very interesting analysis we did and the key message to me was that you can predict the higher risk groups and the way to manage those groups was through management of the safety culture of the organisation in general.

    For example, at [company name removed], every person on site has the job title of “Safety Officer” with their other duties considered secondary. It’s all about focus.

    Thanks again for the post.

    Karen Baker
    Safety Officer

    Thanks for your comment. I am curious about the basis for your conclusions: Would you be willing to share a link to the referenced data?

    Can Predictive Analytics Help Reduce Workplace Risk? or Can Predictive Analytics Help the Optimization of Workplace Risk? Which is more suitable?

    The Health and Safety Laboratory, the UK Health and Safety Executive’s scientific arm, is currently looking at how predictive analytic techniques can be used to help asset intensive, major hazard organisations predict the occurrence of operation critical events such as key asset failures. Operators in the oil and gas and electricity distribution sectors in particular, are becoming increasingly aware that predictive analytic approaches can help them shift their asset management systems away from the traditional reactive (scheduled, break-fix) maintenance type regime, and more towards a proactive (condition-based, preventive) maintenance type regime, i.e. where major outages are avoided in the first place rather than fixed after they happen.

    Prescriptive analytic data can be used to reduce risk, limit liability, and improve human performance. One of the problems is using authoritative data that can be easily verified. The BLS database may be flawed but records generated by OSHA and/or NIOSH may be a good starting point. Most medical and some insurance databases are also flawed for obvious reasons.
    The algorithm should be constructed based on empirical information and statistical analysis of both leading and lagging indicators of a safety management system. Everyone within the committee must agree on what these leading and lagging indicators should be and the hazards and maximum level of risk in order to build the correct model for all industries. A key ingredient is the human element that includes some psychological profiles and consideration for psychosocial disorders.
    If the data and algorithm can be modeled correctly for all businesses, the outcome could be a very useful risk management tool to determine which industries, processes, operations, and/or work tasks would be rewarded for taking a proactive approach to occupational health and safety initiatives and which would not receive a significant return on investment. If everyone agrees on a similar approach, there is a lot of hard work ahead, and rewards could be substantial in an ever shrinking world economy. This kind of critical thinking could validate the need for more health and safety professionals to support business operations and governance.

    I’ve been looking at this area very closely and it has a lot of potential. I want to make a comment. There’s a slight difference between ‘predictive analytics’ and ‘analytics’. The first, if I have it right, is a sub-branch. There is just as much value and perhaps more in seeing patterns that already exist in the data but are hard to see, as there is trying to predict the future.

    Thanks much for sharing your observation. I agree with the distinction you are drawing between the more general term, analytics, and the subset that is being called predictive analytics and also agree that both may have substantial utility in improving workplace prevention practices. Analytics have been in use for a long time to identify trends and “hot spots” and to attempt to determine the causes of injury and disease that may benefit from preventive interventions. I have been able to identify very few examples of the application of prediction to OHS, although it seems likely that there could be an important role.

    Big data was valued at $27.36 billion in 2014, depicting a compound annual growth rate (CAGR) of 40% as compared to the previous year. Industries leveraging big data and analytics solutions can maintain a competitive advantage over their peers with the insights derived from it.

    Finding quality information about this specific subject that’s clear and fascinating is hard. I just wanted to inform you it is a wonderful unique post and I also accept many of the viewpoints you’ve mentioned.


    Predictive analytics is an awesome addition to workplaces and will really help prevent more injuries and everything will be good to go.


    I appreciate the emphasis on the ethical considerations and responsible use of data in this blog post.


Page last reviewed: August 18, 2020
Page last updated: August 18, 2020