Prognosis outcomes - Diagnosis and Prognosis of Occupational disorders based on Machine Learning Techniques applied to Occupational Profiles

Table 5.6: Mined association rules from Metal Stamping area.

N Antecedents Consequents Supp Conf Lift

1 knee_R shoulder_L 0.065 1 4.0

2 knee_R wrist_L 0.062 1 3.2

3 wrist_L, knee_R shoulder_L 0.060 1 4.0

4 shoulder_L, knee_R wrist_L 0.060 1 4.0

5 knee_R wrist_L, shoulder_L 0.060 1 8.0
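The three columns Supp, Conf and Lift in Table 5.6 can be illustrated with a minimal sketch, assuming the usual definitions (support of the whole rule, confidence as rule support divided by antecedent support, lift as confidence divided by consequent support). The `records` below are hypothetical and are not the Metal Stamping data; the function names are illustrative only.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    supp_rule = support(transactions, antecedent | consequent)
    supp_ante = support(transactions, antecedent)
    supp_cons = support(transactions, consequent)
    if supp_ante == 0 or supp_cons == 0:
        raise ValueError("antecedent and consequent must both occur in the data")
    confidence = supp_rule / supp_ante
    lift = confidence / supp_cons
    return supp_rule, confidence, lift


# Hypothetical injury records, one set of affected body parts per worker.
records = [
    frozenset({"knee_R", "shoulder_L"}),
    frozenset({"shoulder_L"}),
    frozenset({"wrist_L"}),
    frozenset({"trunk"}),
]

supp, conf, lift = rule_metrics(
    records, frozenset({"knee_R"}), frozenset({"shoulder_L"})
)
print(supp, conf, lift)
```

Under these definitions, a rule with confidence 1 and lift 4.0, such as rule 1, would imply that the consequent alone has a support of 1 / 4.0 = 0.25.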

Table 5.7: Results of injury severity for the left shoulder.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       1.0235   0.6112   0.1115   0.459 +/- 0.161
Random Forest       1.0025   0.8504   0.1175   0.709 +/- 0.158
Gradient Boosting   1.0187   0.7523   0.1412   0.684 +/- 0.163
LGBM                1.029    0.8735   0.1537   0.732 +/- 0.114
CatBoost            0.9753   0.8555   0.1604   0.749 +/- 0.144
XGBRegressor        1.0158   0.8179   0.1453   0.688 +/- 0.157

the highest value of R2 on the validation data, because its boosting scheme helps to reduce overfitting and improves the quality of the model.

Some of the main features of the CatBoost model are that its default parameters often give good results even without tuning, categorical features do not need preprocessing, computation is quick, and its design helps to limit overfitting. These properties are consistent with its lower errors relative to the other models here.

According to the R2 criterion, the LightGBM model performed better than the other five models. LightGBM grows trees leaf-wise rather than level-wise, which produces more complex trees and is the main factor behind its accuracy. However, leaf-wise growth can sometimes lead to overfitting, which can be limited by setting the max_depth parameter.

Considering the MAE metric, simpler tree-based models such as the DT and RF achieved somewhat better results, but the MAE values of all models are acceptable.
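For reference, the three point metrics reported in these tables can be sketched as follows, assuming their standard definitions (RMSLE computed on log(1 + y), MAE as the mean absolute difference, and R2 as one minus the ratio of residual to total sum of squares). The sample values are illustrative and do not come from the thesis data.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error, using log(1 + y)."""
    squared = [(math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only: the last prediction is off by one unit.
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.0, 1.0, 2.0, 2.0]

print(rmsle(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

RMSLE and MAE are errors, so lower is better, while for R2 higher is better. This is why one model can lead on one column and trail on another.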

Table 5.8: Results of injury severity for the right shoulder.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       1.0256   0.1255   0.0965   0.417 +/- 0.119
Random Forest       0.997    0.6732   0.1968   0.66 +/- 0.034
Gradient Boosting   0.98     0.5518   0.238    0.653 +/- 0.042
LGBM                1.0311   0.7138   0.1929   0.716 +/- 0.055
CatBoost            0.9433   0.7255   0.266    0.701 +/- 0.073
XGBRegressor        0.975    0.5862   0.2405   0.648 +/- 0.019

Table 5.8 shows the results of body injuries for the right shoulder. According to the RMSLE criterion, CatBoost performed better than the other five models. It also has the highest value of R2.

According to the CV-R2 criterion, the LightGBM model performed better than the other five models on validation. Like the other boosting methods it is based on DT algorithms, but it splits the tree leaf-wise with the best fit, whereas many other boosting algorithms split the tree depth-wise or level-wise. When growing the same number of leaves, the leaf-wise algorithm can reduce more loss than the level-wise algorithm, which often results in better accuracy. It uses Gradient-based One-Side Sampling (GOSS) to filter the data instances used for finding a split value, while XGBoost uses pre-sorted and histogram-based algorithms for computing the best split.
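The GOSS idea mentioned above can be sketched in a few lines: keep the instances with the largest absolute gradients, draw a random subset of the rest, and up-weight that subset so the gradient statistics stay roughly unbiased. This is a simplified illustration and not LightGBM's implementation; the function name, the `top_rate` and `other_rate` arguments and the gradient values are assumptions made for the example.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by a simplified one-side sampling."""
    n = len(gradients)
    # Rank instances by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    # Randomly keep a small share of the small-gradient instances.
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))
    # Up-weight the sampled small-gradient instances to compensate.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights


# Hypothetical per-instance gradients of the loss.
grads = [0.9, -0.05, 0.02, -0.8, 0.1, 0.03, -0.01, 0.04, 0.6, -0.02]
idx, w = goss_sample(grads, top_rate=0.2, other_rate=0.2)
print(idx, w)
```

Only the selected instances are then used to evaluate candidate splits, which reduces the cost of split finding while keeping the most informative, large-gradient instances.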

All regression models achieved considerably low MAE values on this dataset, which shows their acceptable capability.

Table 5.9: Results of injury severity for the left elbow.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.6169   0.6735   0.0256   0.418 +/- 0.121
Random Forest       0.6113   0.644    0.0397   0.697 +/- 0.051
Gradient Boosting   0.6098   0.6555   0.0594   0.638 +/- 0.003
LGBM                0.6225   0.6017   0.0923   0.694 +/- 0.08
CatBoost            0.6102   0.7804   0.0629   0.706 +/- 0.086
XGBRegressor        0.6067   0.6465   0.0659   0.621 +/- 0.053

Table 5.9 shows the results of body injuries for the left elbow. Because the number of workers with left elbow injuries in the data set is small, data for this category is scarce, so simpler models are expected to work better on it.

According to the RMSLE criterion, the XGBRegressor model performed better than the other five models. XGBRegressor works well on small to medium datasets and is designed to handle missing data with its built-in features. XGBoost uses DTs as base learners, combining many weak learners to make a strong learner. As a result, it is referred to as an ensemble learning method, since it uses the output of many models in the final prediction.
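The idea of combining many weak learners into a strong one can be sketched with a toy gradient boosting loop for squared error, in which each round fits a one-split tree (a stump) to the current residuals. This is a minimal illustration, not XGBoost's algorithm, which adds regularization and second-order information; the data, `n_rounds` and `learning_rate` are assumptions chosen for the example.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree on a single feature (needs >= 2 distinct xs)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then add shrunken stumps fitted to the residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical one-feature data set.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.1, 0.2, 0.2, 0.8, 0.9, 1.0]

mean_y = sum(ys) / len(ys)
baseline_mse = mse(lambda x: mean_y, xs, ys)
boosted_mse = mse(fit_boosted(xs, ys), xs, ys)
print(baseline_mse, boosted_mse)
```

Each stump alone is a weak predictor, but their shrunken sum fits the training data far more closely than the constant baseline.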

According to the R2 criterion, the DT model performed well (0.6735), second only to CatBoost (0.7804). It is one of the quickest ways to identify relationships between variables and to find the most significant variable. DTs are not strongly influenced by outliers or missing values, and they can handle both numerical and categorical variables. Since the DT is a non-parametric method, it makes no assumptions about the distribution of the data or the structure of the classifier.

CatBoost has the highest value of R2 on the validation data because its structure reduces overfitting and improves the generalization of the model.
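The CV-R2 column reports a mean and a spread over cross-validation folds. A minimal sketch of how such a "mean +/- deviation" figure can be produced is given below. The number of folds, the interleaved fold assignment, the use of the sample standard deviation, the simple linear model and the synthetic data are all assumptions, since the text does not state these details.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def r2_score(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_val_r2(xs, ys, k=5):
    """Return (mean, standard deviation) of R2 over k interleaved folds."""
    n = len(xs)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        test_x = [xs[i] for i in sorted(test_idx)]
        test_y = [ys[i] for i in sorted(test_idx)]
        model = fit_line(train_x, train_y)
        scores.append(r2_score(test_y, [model(x) for x in test_x]))
    return statistics.mean(scores), statistics.stdev(scores)


# Synthetic data: a straight line with small alternating noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]

mean_r2, std_r2 = cross_val_r2(xs, ys, k=5)
print(f"{mean_r2:.3f} +/- {std_r2:.3f}")
```

A small deviation across folds suggests that the score does not depend strongly on which part of the data is held out.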

Although the DT and RF models achieved the best MAE values, they were not the superior models overall in comparison with the other models.

Table 5.10: Results of injury severity for the right elbow.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.7056   0.734    0.0562   0.685 +/- 0.077
Random Forest       0.6861   0.8454   0.1092   0.815 +/- 0.094
Gradient Boosting   0.6888   0.8268   0.1158   0.784 +/- 0.076
LGBM                0.6817   0.7868   0.13     0.789 +/- 0.062
CatBoost            0.6611   0.8223   0.1413   0.834 +/- 0.07
XGBRegressor        0.6865   0.8318   0.1227   0.793 +/- 0.094

Similarly, Table 5.10 shows the results of body injuries for the right elbow. According to the RMSLE criterion, CatBoost performed better than the other five models. It also has the highest value of R2 on the validation data.

The simpler learning approach of the first two models (DT and RF) seems to let them reach lower MAE values, although it did not help them achieve the best overall results among all models.

According to the R2 criterion, the RF performed better than the other five models. The random forest algorithm typically predicts more accurately than a single decision tree because, in RF regression, each tree produces its own prediction and the output of the regression is the mean of the individual tree predictions.
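The averaging step of RF regression can be shown in isolation. In the sketch below the three "trees" are hypothetical stand-ins written as simple threshold rules; a real random forest would fit each tree on a bootstrap sample of the training data.

```python
from statistics import mean


def forest_predict(trees, x):
    """RF regression output: the mean of the individual tree predictions."""
    return mean(tree(x) for tree in trees)


# Three hypothetical trees, each a single threshold rule on one feature.
trees = [
    lambda x: 0.2 if x < 3.0 else 0.8,
    lambda x: 0.1 if x < 4.0 else 0.9,
    lambda x: 0.3 if x < 2.0 else 0.7,
]

# For x = 3.5 the trees predict 0.8, 0.1 and 0.7, so the forest returns their mean.
prediction = forest_predict(trees, 3.5)
print(prediction)
```

Averaging smooths out the errors of individual trees, which is the usual explanation for the forest being more stable than any single tree.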

Table 5.11: Results of injury severity for the left wrist.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.8114   0.2598   0.0395   0.592 +/- 0.099
Random Forest       0.7561   0.7972   0.0807   0.784 +/- 0.069
Gradient Boosting   0.7534   0.7063   0.1155   0.79 +/- 0.047
LGBM                0.7749   0.8044   0.112    0.796 +/- 0.042
CatBoost            0.73     0.8231   0.0915   0.811 +/- 0.061
XGBRegressor        0.7471   0.7621   0.1179   0.811 +/- 0.055

Table 5.11 shows the results of body injuries for the left wrist. The CatBoost model performed better than the other five models in terms of RMSLE and R2, shared the best CV-R2 with XGBRegressor, and achieved the third-lowest MAE value. The dataset has a significant percentage of injury data for the left wrist area, and more complex models tend to perform better with more data, which is why the CatBoost model is better than the other five models in this area.

Table 5.12: Results of injury severity for the right wrist.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.7599   0.4498   0.0392   0.112 +/- 0.237
Random Forest       0.7648   0.5525   0.1286   0.488 +/- 0.136
Gradient Boosting   0.8      0.6525   0.1337   0.494 +/- 0.171
LGBM                0.686    0.8188   0.1199   0.569 +/- 0.143
CatBoost            0.7117   0.6549   0.1319   0.624 +/- 0.109
XGBRegressor        0.7853   0.4054   0.1355   0.523 +/- 0.159

Table 5.12 shows the results of body injuries to the right wrist. LightGBM performed better than the other five models in terms of RMSLE and R2, although it did not achieve the best MAE. LightGBM uses a leaf-wise split technique rather than a level-wise split strategy to generate considerably more complex trees, which is the major factor in its higher accuracy here. On the validation data, CatBoost has the highest R2 value, because its boosting techniques aid in reducing overfitting and improving model quality.

Table 5.13: Results of injury severity for the left fingers.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.5967   0.4559   0.0112   0.541 +/- 0.368
Random Forest       0.5774   0.7721   0.0383   0.798 +/- 0.096
Gradient Boosting   0.5948   0.769    0.0243   0.69 +/- 0.152
LGBM                0.6058   0.7298   0.0442   0.718 +/- 0.052
CatBoost            0.5621   0.7346   0.0263   0.745 +/- 0.026
XGBRegressor        0.5823   0.7973   0.0275   0.769 +/- 0.057

Considering the left finger injuries (Table 5.13), the models behave differently.

The CatBoost model performed better than the other five models in terms of RMSLE. It increases accuracy by reducing overfitting and making more efficient predictions, which is consistent with its lower error here.

The XGBRegressor performed better than the other five models in terms of R2. XGBRegressor is designed to handle missing data with its built-in capabilities and works well on small and medium-sized datasets.

The simplicity of the Decision Tree model makes it suitable for achieving a better MAE, although it was not the best model with respect to the R2 score.

The RF performed better than the other five models in terms of CV-R2. Because each tree in RF regression gives its own prediction and the regression's output is the average of these predictions, the RF algorithm typically predicts more accurately than the DT approach.

Table 5.14: Results of injury severity for the right fingers.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.6014   0.7184   0.0266   0.558 +/- 0.132
Random Forest       0.6004   0.8189   0.0565   0.726 +/- 0.084
Gradient Boosting   0.62     0.6815   0.0389   0.728 +/- 0.081
LGBM                0.6109   0.7012   0.0815   0.684 +/- 0.093
CatBoost            0.5825   0.8531   0.0532   0.734 +/- 0.101
XGBRegressor        0.6091   0.7609   0.0464   0.754 +/- 0.083

Similarly, for right finger injuries (Table 5.14), CatBoost performed better than the other five models in terms of RMSLE and R2. The XGBRegressor performed better than the other five models in terms of CV-R2.

Table 5.15 shows the results of left knee injuries. CatBoost performed better than the other five models in terms of RMSLE, MAE and CV-R2. The RF performed better than the other five models in terms of R2.

Regarding right knee injuries, which are shown in Table 5.16, CatBoost performed better than the other five models in terms of RMSLE. The Decision Tree has the highest R2 and the lowest MAE, while the Random Forest performed better than the other five models in terms of CV-R2.

Table 5.15: Results of injury severity for the left knee.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.4791   0.3067   0.0251   0.347 +/- 0.441
Random Forest       0.4946   0.7532   0.0242   0.437 +/- 0.207
Gradient Boosting   0.5207   0.639    0.0277   0.226 +/- 0.112
LGBM                0.5232   0.5662   0.0308   0.302 +/- 0.177
CatBoost            0.4738   0.6775   0.0228   0.45 +/- 0.099
XGBRegressor        0.5159   0.6158   0.0253   0.384 +/- 0.201

Table 5.16: Results of injury severity for the right knee.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.4585   0.8192   0.0208   0.592 +/- 0.334
Random Forest       0.4571   0.7773   0.0408   0.739 +/- 0.163
Gradient Boosting   0.4539   0.6701   0.0483   0.673 +/- 0.246
LGBM                0.4639   0.5325   0.0743   0.545 +/- 0.349
CatBoost            0.4383   0.7069   0.0395   0.721 +/- 0.12
XGBRegressor        0.4517   0.5281   0.0539   0.699 +/- 0.22

Table 5.17: Results of injury severity for the left foot.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.7371   0.4458   0.0185   0.552 +/- 0.152
Random Forest       0.7201   0.648    0.0099   0.764 +/- 0.066
Gradient Boosting   0.5928   0.0051   0.0307   0.727 +/- 0.062
LGBM                0.7248   0.5984   0.0151   0.727 +/- 0.062
CatBoost            0.6965   0.6773   0.0126   0.764 +/- 0.039
XGBRegressor        0.711    0.6629   0.001    0.738 +/- 0.041

With regard to left foot injury severity prediction (Table 5.17), CatBoost performed better than the other five models in terms of RMSLE, R2, and CV-R2, and it also obtained a considerably low MAE, which makes it the best model overall.

Table 5.18: Results of injury severity for the right foot.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.7481   0.342    0.0014   0.497 +/- 0.238
Random Forest       0.7302   0.616    0.0099   0.756 +/- 0.062
Gradient Boosting   0.7265   0.6087   0.0058   0.74 +/- 0.034
LGBM                0.7248   0.5984   0.0151   0.727 +/- 0.062
CatBoost            0.6965   0.6773   0.0126   0.764 +/- 0.039
XGBRegressor        0.711    0.6629   0.001    0.738 +/- 0.041

Table 5.18 shows the results of right foot injuries, in which CatBoost performed better than the other five models in terms of RMSLE, R2, and CV-R2. However, XGBRegressor and DT obtained the best MAE results.

Table 5.19: Results of injury severity for the trunk.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.9132   0.739    0.2602   0.844 +/- 0.033
Random Forest       0.8948   0.8288   0.2925   0.89 +/- 0.013
Gradient Boosting   0.8951   0.8306   0.307    0.853 +/- 0.031
LGBM                0.8901   0.8684   0.1622   0.893 +/- 0.015
CatBoost            0.8851   0.8443   0.3352   0.896 +/- 0.017
XGBRegressor        0.9129   0.8416   0.3255   0.866 +/- 0.03

CatBoost performed better than the other five models in terms of RMSLE and CV-R2 on the trunk injuries dataset (Table 5.19). LightGBM performed better than the other five models in terms of R2 and MAE. For all models, R2 on the validation data is better than R2 on the training data, which shows that the models generalize well.

Table 5.20: Results of injury severity for the neck.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       0.5351   0.6776   0.0598   0.813 +/- 0.136
Random Forest       0.5348   0.6875   0.1108   0.886 +/- 0.03
Gradient Boosting   0.5393   0.6962   0.108    0.901 +/- 0.035
LGBM                0.5493   0.4906   0.1216   0.672 +/- 0.077
CatBoost            0.536    0.5905   0.1086   0.851 +/- 0.026
XGBRegressor        0.5379   0.6752   0.1051   0.887 +/- 0.041

Regarding the neck injury results in Table 5.20, GB performed better than the other five models in terms of R2 and CV-R2. GB can optimize different loss functions and provides several hyperparameter tuning options that make the function fit very flexible. The DT model, the base learner on which GB builds, achieved comparable results, and its MAE is the lowest of all models.

Table 5.21: Results of vibration using forbiddance.

Model               RMSLE    R2       MAE      CV-R2
Decision Tree       1.189    0.5933   0.2245   0.72 +/- 0.087
Random Forest       1.151    0.7414   0.2365   0.793 +/- 0.044
Gradient Boosting   1.1653   0.5783   0.2455   0.764 +/- 0.034
LGBM                1.1766   0.7544   0.2316   0.802 +/- 0.027
CatBoost            1.1217   0.7744   0.2022   0.803 +/- 0.029
XGBRegressor        1.1412   0.7247   0.2211   0.75 +/- 0.056

Table 5.21 shows the regression model results for predicting the forbiddance probability of working with tools that have a vibration in their working mechanism. As the presented results show, although the RF model achieved a comparatively low RMSLE, its R2 score is less promising. On the other hand, the CatBoost model has the best overall performance across all of the considered metrics.

Table 5.22: Results of weight lifting forbiddance.

Model               RMSLE    MAE      R2       CV-R2
Decision Tree       0.8051   0.280    0.4262   0.465 +/- 0.129
Random Forest       0.5665   0.2727   0.6897   0.674 +/- 0.026
Gradient Boosting   0.6513   0.2721   0.7094   0.654 +/- 0.046
LGBM                0.5292   0.2916   0.7017   0.676 +/- 0.025
CatBoost            0.6693   0.2499   0.7488   0.724 +/- 0.04
XGBRegressor        0.6549   0.2789   0.7077   0.681 +/- 0.038

In Table 5.22, we predict the forbiddance probability of lifting heavy weights by workers based on their medical records. The performance trend is again similar to the previous table: the CatBoost regression model has the best performance, achieving better MAE and R2 than the other models, as its more complex structure allows it to fit the studied data better, while the LightGBM model has the best RMSLE.

In general, we determine the best overall regression model for each body part based on the considered metrics; the results are given in Table 5.23.

Table 5.23: Best overall model per each body part.

Body part        Best model
Left shoulder    CatBoost
Right shoulder   CatBoost
Left elbow       CatBoost
Right elbow      CatBoost
Left wrist       CatBoost
Right wrist      LGBM
Left fingers     XGBRegressor
Right fingers    CatBoost
Left knee        CatBoost
Right knee       Decision Tree
Left foot        CatBoost
Right foot       CatBoost
Trunk            CatBoost
Neck             Gradient Boosting
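The text does not state the exact rule used to pick the "best overall" model. One plausible reading, sketched below as an assumption, is to count how many of the four metrics each model wins. The values are copied from Table 5.7 (left shoulder); the function name and the counting rule itself are illustrative.

```python
from collections import Counter

# Values copied from Table 5.7: (RMSLE, R2, MAE, CV-R2 mean).
results = {
    "Decision Tree": (1.0235, 0.6112, 0.1115, 0.459),
    "Random Forest": (1.0025, 0.8504, 0.1175, 0.709),
    "Gradient Boosting": (1.0187, 0.7523, 0.1412, 0.684),
    "LGBM": (1.029, 0.8735, 0.1537, 0.732),
    "CatBoost": (0.9753, 0.8555, 0.1604, 0.749),
    "XGBRegressor": (1.0158, 0.8179, 0.1453, 0.688),
}

# Column index and whether a larger value is better for that metric.
metrics = {
    "RMSLE": (0, False),
    "R2": (1, True),
    "MAE": (2, False),
    "CV-R2": (3, True),
}


def metric_winners(results, metrics):
    """Return the best model for each metric (assumed rule, not from the thesis)."""
    winners = {}
    for name, (col, higher_is_better) in metrics.items():
        pick = max if higher_is_better else min
        winners[name] = pick(results, key=lambda model: results[model][col])
    return winners


winners = metric_winners(results, metrics)
best_model, n_wins = Counter(winners.values()).most_common(1)[0]
print(winners)
print(best_model, n_wins)
```

For the left shoulder this rule selects CatBoost, which wins RMSLE and CV-R2, in agreement with Table 5.23. Ties between models would need an additional tie-breaking rule.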

According to the trunk results, the precision value (the MAE column of Table 5.19) had an upward trend, starting at 0.1622 with the LightGBM algorithm and steadily rising to 0.2602 for the DT and 0.2925 for the RF approach. In the last stage of the experiments, the CatBoost algorithm reached a peak of 0.3352. Clearly, the R2 enhanced the precision value of other DT algorithms by 91% (XGBoost), 90% (DT) and 89% (RF).

The highest rates among the three categories were attained by the second injured body part, which is the right shoulder. The highest R2 values were attained by CatBoost (0.72) and LightGBM (0.71), while the other approaches had lower R2 values. The RMSLE test results did not show any significant differences in the case of the error metrics, and even so, the contribution of the MAE method is clear, with an improvement over the CatBoost technique of 7% and over the XGBoost method of 1%.

R2 is considered an overall metric that is included in the prediction of the next medical appointment. The results show the steady rise of R2 values during the 126 weeks. The lowest value belongs to the DT algorithm (0.61), while the peak of 0.85 corresponds to the CatBoost algorithm. In other words, the 1.22 RMSLE of the CatBoost regression algorithm significantly contributed to the RMSLE values of the XGBRegressor (1.23), Gradient Boosting (1.26), and Random Forest (1.27) algorithms, respectively.

To conclude, 126 weeks of evaluations (4.7) based on the three most familiar metrics, namely R2, MAE and RMSLE, demonstrate that the proposed regression algorithm in this research, CatBoost, successfully enhanced the prediction accuracy compared to the other studied and implemented recommender approaches. Pertaining to the problem of MSD prevention, the investigation results based on collected relevance feedback from OH-PPs indicate that the CatBoost algorithm may significantly contribute to resolving this issue. In addition to addressing new workers with a history of medical visits, the CatBoost approach could improve the average value of regression for new workers by 5% compared to the traditional algorithm.

5.3.1.1 First scenario learning curves

To analyze the learning process of the different models more visually, we used 80% of the studied data for training and the remaining 20% as test data for comparing their learning trends.

Learning curves compare the optimal value of a model's loss function on a training set with the same loss function evaluated on a validation data set with the same parameters. In this section, we use such plots to determine how much extra training data benefits a learning model and whether the estimation is affected by potential overfitting.
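The procedure behind such a learning curve can be sketched as follows: split the data 80/20, train on growing subsets of the training part, and record the MAE on both the subset and the held-out part. The synthetic data, the simple linear model and the chosen subset sizes are assumptions made for this illustration and do not reproduce the models or data of the study.

```python
import random


def fit_line(xs, ys):
    """Least-squares line for one feature; falls back to the mean if x is constant."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def mae(points, model):
    return sum(abs(model(x) - y) for x, y in points) / len(points)


def learning_curve(train, test, sizes):
    """Return (size, train MAE, test MAE) for each training subset size."""
    curve = []
    for size in sizes:
        subset = train[:size]
        model = fit_line([x for x, _ in subset], [y for _, y in subset])
        curve.append((size, mae(subset, model), mae(test, model)))
    return curve


# Synthetic data: y = 0.5 * x plus Gaussian noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 0.5 * x + rng.gauss(0.0, 1.0)))

split = int(0.8 * len(data))  # 80% train, 20% test
train, test = data[:split], data[split:]

curve = learning_curve(train, test, [5, 20, 80, 160])
for size, train_mae, test_mae in curve:
    print(size, round(train_mae, 3), round(test_mae, 3))
```

A large gap between the training and test error at small sizes that narrows as the size grows is the usual sign that more data helps; a gap that stays wide points to overfitting.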

Figure 5.2(a) shows the learning curves of the models for right elbow injury prediction. It indicates that once the data size reaches a level that is sufficient for the learning process and prevents the occurrence of overfitting, the error rate of most models, such as CatBoost, XGBoost, RF, and GB, decreases with increasing data size. However, LightGBM and DT seem to fit the data less well and do not achieve the best possible result.

The estimation of possible left-hand finger injuries for the models is shown in Figure 5.2(b). As the figure illustrates, although the models eventually achieve good performance at the maximum size, increasing the training set size does not seem to show a clear correlation with the final model error, and in both cases, the CatBoost model performance is remarkably good. Other models have shown a similar downward trend, except for LightGBM, which may indicate that this model is more of a fast computational model than a comprehensive one.

[Figure 5.2, panels (a) and (b): MAE versus training set size (100 to 600) for DecisionTree, RandomForest, GBoosting, LGBM, CatBoost and XGboost.]

Figure 5.2: Learning curves of models for (a) right elbow and (b) left fingers.

5.3.1.2 Train Data Body Parts Injury

A pie chart shows the relationship of parts to the whole and is most helpful when there are a small number of levels. In this section, we examine the percentage of injuries in different areas of the body for different categories using pie charts.

(a) Train data: shoulder_L 13%, shoulder_R 16%, elbow_L 6%, elbow_R 8%, fingers_L 4%, fingers_R 5%, knee_L 2%, knee_R 4%, foot_L 4%, foot_R 4%, wrist_L 7%, wrist_R 7%, trunk 19%, neck 4%.

(b) Test data: shoulder_L 12%, shoulder_R 15%, elbow_L 6%, elbow_R 8%, fingers_L 4%, fingers_R 4%, knee_L 2%, knee_R 4%, foot_L 3%, foot_R 3%, wrist_L 7%, wrist_R 7%, trunk 19%, neck 5%.

Figure 5.3: Average body injuries in men for (a) train data and (b) test data.

Figure 5.3(a) shows a pie chart of body injuries in men. On the train data, the trunk accounts for the largest share of injuries in men (19%), followed by the right shoulder (16%). The shoulder is actually a collection of joints, tendons, and muscles that provide a wide range of motion for the hand to perform a variety of movements. This high mobility comes at a cost, such as joint instability or soft tissue damage. Shoulder pain can also be caused by high activity, which is more common in men. The left shoulder is third with 13%, and the remaining body parts each account for a smaller percentage of the damage.

The same plot for the test data in Figure 5.3(b) shows that the trunk again accounts for the largest share of injuries in men (19%), followed by the right shoulder (15%). As in the train data, the highest percentage of injuries in the test data is from the trunk, the shoulders come second, and the remaining body parts each account for a smaller percentage of the injuries.
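The percentages behind such a pie chart are simple shares of a count. A minimal sketch, assuming one record per injury labelled with the affected body part, is shown below. The counts are made up for illustration and were only chosen so that a few shares match Figure 5.3(a); the "other" label stands in for all remaining body parts.

```python
from collections import Counter


def injury_shares(records):
    """Return the rounded percentage of records per body part, largest first."""
    counts = Counter(records)
    total = sum(counts.values())
    return {part: round(100 * n / total) for part, n in counts.most_common()}


# Hypothetical injury records (100 in total).
records = (
    ["trunk"] * 19
    + ["shoulder_R"] * 16
    + ["shoulder_L"] * 13
    + ["wrist_L"] * 7
    + ["other"] * 45
)

shares = injury_shares(records)
print(shares)
```

Because each share is rounded separately, the displayed percentages of a real chart need not sum to exactly 100.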

(a) Train data: shoulder_L 13%, shoulder_R 16%, elbow_L 8%, elbow_R 10%, fingers_L 3%, fingers_R 3%, knee_L 2%, knee_R 3%, foot_L 4%, foot_R 4%, wrist_L 15%, wrist_R 13%, trunk 10%, neck 4%.

(b) Test data: shoulder_L 12%, shoulder_R 15%, elbow_L 9%, elbow_R 9%, fingers_L 3%, fingers_R 2%, knee_L 3%, knee_R 3%, foot_L 4%, foot_R 4%, wrist_L 13%, wrist_R 12%, trunk 11%, neck 4%.

Figure 5.4: Average body injuries in women for (a) train data and (b) test data.

In Figure 5.4(a), relating to body injuries in women on the train data, we can observe that the right shoulder accounts for the largest share of injuries (16%), followed by the left wrist (15%). The left shoulder and right wrist are third with 13% each, and the remaining body parts each account for a smaller percentage of the damage. The reason could be that women do most of the housework, and this type of work causes pain in the shoulders and wrists.

So, in general, the right shoulder accounts for an equal share of injuries (16%) in both men and women, while in men the trunk has the largest share, with 19%.

The corresponding pie chart on the test data shows that in women the right shoulder accounts for the largest share of injuries (15%), followed by the left wrist (13%) and then the left shoulder and right wrist (12% each). The remaining body parts each account for a smaller percentage of the injuries.

If the results are also analyzed by the work history of the workers, a good view of the injuries in relation to work experience can be obtained. In other words, it becomes clear which areas of the body are affected by longer work experience, and the consequences can be anticipated.

Figure 5.5 shows the body injuries in workers with a work experience of 1–10 and identifies which area has suffered the most in these workers for both training and test data. According to the train data analysis, the highest percentage of bodily injuries among these workers was in the right shoulder (14%), followed by the left wrist (13%) and the trunk (12%). The remaining body parts each account for a smaller percentage of the damage.

(a) Train data: shoulder_L 8%, shoulder_R 14%, elbow_L 6%, elbow_R 5%, fingers_L 3%, fingers_R 3%, knee_L 2%, knee_R 4%, foot_L 3%, foot_R 3%, wrist_L 13%, wrist_R 11%, trunk 12%, neck 3%.

(b) Test data: shoulder_L 8%, shoulder_R 13%, elbow_L 6%, elbow_R 4%, fingers_L 3%, fingers_R 3%, knee_L 3%, knee_R 5%, foot_L 3%, foot_R 3%, wrist_L 10%, wrist_R 10%, trunk 14%, neck 2%.

Figure 5.5: Average body injuries in seniority of 1 - 10 for (a) train data and (b) test data.

Sub-figure (b) identifies which area has suffered the most in these workers on the test data. According to the analysis, the highest percentage of bodily injuries in these workers was in the trunk (14%), followed by the right shoulder (13%) and the wrists (10% each). The remaining body parts each account for a smaller percentage of the damage. In the train data, the highest percentage was related to the right shoulder, but in the test data, the highest percentage was related to the trunk.
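The comparison between the train and test distributions can be made explicit with a short sketch. The percentages are copied from Figure 5.5(a) and (b); the helper names are illustrative.

```python
# Percentages copied from Figure 5.5 (seniority of 1 - 10).
train = {
    "shoulder_L": 8, "shoulder_R": 14, "elbow_L": 6, "elbow_R": 5,
    "fingers_L": 3, "fingers_R": 3, "knee_L": 2, "knee_R": 4,
    "foot_L": 3, "foot_R": 3, "wrist_L": 13, "wrist_R": 11,
    "trunk": 12, "neck": 3,
}
test = {
    "shoulder_L": 8, "shoulder_R": 13, "elbow_L": 6, "elbow_R": 4,
    "fingers_L": 3, "fingers_R": 3, "knee_L": 3, "knee_R": 5,
    "foot_L": 3, "foot_R": 3, "wrist_L": 10, "wrist_R": 10,
    "trunk": 14, "neck": 2,
}


def top_parts(shares, k=3):
    """Body parts with the k largest shares (ties keep their listed order)."""
    return sorted(shares, key=shares.get, reverse=True)[:k]


def largest_shift(first, second):
    """Body part whose share differs most between the two distributions."""
    return max(first, key=lambda part: abs(first[part] - second[part]))


print(top_parts(train))
print(top_parts(test))
print(largest_shift(train, test))
```

The ranking confirms the change of the leading body part from the right shoulder in the train data to the trunk in the test data, and the largest single difference is the three-point drop for the left wrist.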

(a) Train data: shoulder_L 16%, shoulder_R 18%, elbow_L 6%, elbow_R 8%, fingers_L 5%, fingers_R 7%, knee_L 2%, knee_R 2%, foot_L 5%, foot_R 5%, wrist_L 12%, wrist_R 7%, trunk 13%, neck 2%.

(b) Test data: shoulder_L 14%, shoulder_R 17%, elbow_L 5%, elbow_R 8%, fingers_L 6%, fingers_R 6%, knee_L 1%, knee_R 1%, foot_L 3%, foot_R 3%, wrist_L 12%, wrist_R 7%, trunk 13%, neck 2%.

Figure 5.6: Average body injuries in seniority of 10 - 20 for (a) train data and (b) test data.

Figure 5.6 illustrates the body injuries in workers with a work experience of 10-20 and identifies which area has suffered the most in these workers. According to the train data analysis, the highest percentage of physical injuries among these workers in
