FUZZY CLUSTERING: APPLICATION ON ORGANIZATIONAL METAPHORS IN BRAZILIAN COMPANIES

Different theories of organization and management are based on implicit images or metaphors. A quantitative approach is nevertheless needed to minimize human subjectivity and bias in metaphor studies. This paper therefore analyzes the presence of metaphors and clusters them using fuzzy data mining techniques in a sample of 61 Brazilian companies operating in the state of Rio Grande do Sul. For this purpose, the results of a questionnaire answered by 198 employees of the companies in the sample were analyzed with the free software R. The results show that it is difficult to find a single clear image in most organizations. In most cases characteristics of different images or metaphors are observed, so soft computing techniques are particularly appropriate for this type of analysis. According to these results, the most present image in the organizations studied is that of “organisms”, and the least present images are those of a “political system” and an “instrument of domination”.


INTRODUCTION
Images are metaphors that permit an interpretation of what is happening in the organizational culture (Morgan, 1986). They help to better understand employees' behavior, for example whether or not they adhere to organizational compliance, as in the study by Spears and Barki (2010), which analyzed employees' awareness of information security. Bulgurcu, Cavusoglu and Benbasat (2010) also analyzed employees' behavior on compliance matters, in that case involving rationality-based beliefs and information security awareness.
The study of organizational images is widely used in management courses because it provides a very detailed view of organizations. However, when the analysis is narrowed to a sector or regional study, this kind of approach becomes limited. To overcome this limitation, this study presents an approach using Data Mining and Soft Clustering techniques to understand, through images, what can happen in the organizational culture of a large number of companies.
The application developed here covered 61 enterprises, but the method can be extended to a much larger number of companies, up to a complete study of a whole country. The paper is organized into a theoretical basis, methodological aspects, a case study and, finally, a discussion of the results.
Morgan (1986) believes that one can better understand organizations by recognizing the metaphors that prompt one to view organizations from a certain angle. Metaphors play a paradoxical role: they are vital to understand and highlight certain aspects of organizations, while they restrict understanding by backgrounding or ignoring others. Morgan illustrates his ideas by exploring eight archetypical metaphors of organization: Machines, Organisms, Brains, Cultures, Political Systems, Psychic Prisons, Flux and Transformation, and Instruments of Domination. In this work, following Knorst, Vanti, Andrade and Johann (2011), the brain and culture metaphors are considered as a single image, so 7 metaphors or images are analyzed.

Mechanistic (M):
Organizations that impose rigid routines and patterns, hierarchically distributed. Dealings are impersonal and control of the organization is bureaucratic. Because it is very predictable, it is no longer regarded as ideal, even in stable and authoritarian institutions. This style also presents difficulties for innovation.

Psychic Prisons (PP):
Inflexibility is a characteristic of this image: the organization becomes a prisoner of past events, tied to the fundamental attitudes of its founders. Some of its traps are false assumptions, unquestioned rules and fanaticism around the charisma of the leader.

Political Systems (PS):
This image often does not serve the interest of the group and tends to favor authoritarian executives. Companies with participatory management are also encompassed in political systems because, although there is a certain distribution of power, the central objective will be executed by both subordinates and the owners of the capital.

Instruments of Domination (ID):
In organizations viewed as instruments of domination, the employees and managers need to dedicate themselves completely to the company. They feel insecure about their employment and experience a lot of stress on the job.

Organisms (O):
The fundamental principle of organisms is that they are based on the employees' intellectual capital. Motivation is a substantial factor. Because there are targets to reach and innovations to develop constantly, employees tend to follow a biological clock.

Brain/Cybernetic (C):
Intellectual capital is highly valued and is constantly being stimulated to improve. Decision-making needs to be done "through formal or temporary processes, producing policies and plans that offer a point of reference or a structure for information processing" (Johann, 2008, p. 33). The definition of cybernetic is given due to the fact that information technology is permanently present, which ensures better conditions in the review of political norms and procedures, in addition to learning how to absorb changes in the environment.

Flux and Transformation (FT):
Organizations that best mirror flux and transformation are those that modify and evolve to conform to change and evolution in the environment. Their survival depends on their internal and external environments.

These images represent employees' behavior, as Morgan (1986) stated. This approach works well for organizational case studies but has some limitations in sector and regional studies that use the same criteria. To address this, the paper proposes the application of Data Mining (DM) and Soft Clustering techniques, which are presented below. Organizational identity can also reside in metaphorical images internalized by the members of the organization (Taber, 2007). Each employee's perception of the image projected by the organization can be quite different, so a fuzzy approach is particularly appropriate.

DATA MINING AND SOFT CLUSTERING
Simply stated, data mining refers to extracting or "mining" knowledge from large amounts of data (Han & Kamber, 2006). This area has attracted a great deal of attention in the information industry and in society as a whole in recent years, and data mining techniques have been applied to a wide variety of areas. They have been used to predict behavioral patterns, generate forecasts, identify trends or changes in them, and discover relationships between pieces of information in order to optimize decision making. Their practical value is therefore clear in processes where a large amount of data must be handled.
Cluster analysis or clustering is a main task of exploratory data mining, and a common technique for statistical data analysis used in many fields (Kaufman & Rousseeuw, 2008). Data clustering is the process of dividing data elements into classes or clusters so that items in the same class are as similar as possible, and items in different classes are as dissimilar as possible (Witten & Frank, 2005). Clustering algorithms can reveal the underlying structures in data, which can be exploited in a wide variety of applications, including classification, image processing and pattern recognition, modeling and identification. In particular, data mining techniques can be used to identify categories or behavioral patterns in organizations.
Many clustering algorithms have been introduced in the literature (Pedrycz, 2005). A widely accepted classification scheme subdivides these techniques into two main groups: hard (crisp) and soft (fuzzy) clustering. In hard clustering, data is divided into distinct clusters, where each data element belongs to exactly one cluster. In fuzzy clustering, data elements can belong to more than one cluster, and each element has an associated set of membership levels that indicate the strength of its association with each cluster. Due to the fuzzy nature of many practical problems, a number of fuzzy clustering methods have been developed following the general fuzzy set theory strategies outlined by Zadeh (1965). Fuzzy set theory deals with the representation of classes whose boundaries are not well defined. The key idea is to associate a membership function that takes values in the interval [0,1], with 0 corresponding to non-membership in the class and 1 corresponding to full membership. Thus, membership is an intrinsically gradual notion rather than an abrupt one, as in conventional Boolean logic.
The concept of fuzzy partition is essential for cluster analysis and identification techniques that are based on fuzzy clustering. The best-known method of fuzzy clustering is the Fuzzy c-Means method (FCM), initially proposed by Dunn (1973) and generalized by Bezdek (1981) and other authors; an overview is presented in Kruse, Hoppner, Klawonn and Runkler (1999). The FCM is based on an optimization problem whose objective function is defined as:

$$J_m = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}$$

where {x_1, x_2, …, x_n} is the input sample set, that is, the objects that have to be clustered, c is the number of clusters, {c_1, c_2, …, c_c} are the centroids of the clusters, which can be defined by a given matrix or randomly chosen, and u_ij is the degree of membership of x_i in cluster j. Finally, the parameter m is a real number greater than 1 that acts as a weighting factor called the fuzzifier. Normally the Euclidean distance is used, but any norm ||*|| expressing the dissimilarity between any measured data and the center can be used. One of the drawbacks of FCM is that the number of clusters, c, must be specified before the algorithm is applied. Methods for selecting the number of clusters can be found in the literature (Pham, Dimov, & Nguyen, 2005).
Fuzzy partitioning is carried out through an iterative minimization of the objective function under the following fuzzy constraints:

$$u_{ij} \in [0,1] \;\; \forall i,j; \qquad \sum_{j=1}^{c} u_{ij} = 1 \;\; \forall i; \qquad 0 < \sum_{i=1}^{n} u_{ij} < n \;\; \forall j$$

In the approach proposed by Bezdek (1981), in each iteration the membership levels u_ij and centroid positions c_j are updated by applying the technique of Lagrange multipliers. The algorithm stops when a maximum number of iterations is reached, or when the algorithm is unable to reduce the current value of the objective function.
Given that different organizational images can often be linked to an organization, a soft clustering approach is considered more appropriate in this work. Using the FCM technique, each organization is allowed to belong to several clusters with different degrees of membership, and can therefore have multiple images or metaphors linked to it. The results of the analysis are presented in this paper.
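The update step itself is not written out in the text above. As a reference sketch, the closed-form updates usually given in the FCM literature, obtained by setting the derivatives of the Lagrangian to zero, are stated below. They assume the Euclidean norm and that no x_i coincides exactly with a centroid; they are the standard textbook form, not a formula taken from this paper.

```latex
u_{ij} = \frac{1}{\displaystyle\sum_{k=1}^{c}
          \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}},
\qquad
c_j = \frac{\displaystyle\sum_{i=1}^{n} u_{ij}^{m} \, x_i}
           {\displaystyle\sum_{i=1}^{n} u_{ij}^{m}}
```

With m close to 1 the memberships approach a hard assignment, while larger values of m make the partition fuzzier.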
Any data mining process is composed of the following basic phases or stages: data compilation; data processing (in which data are cleaned, transformed and reduced); application of data mining (determining the model to use, carrying out statistical analysis, and graphically visualizing data to obtain a first approximation); and finally, interpretation and evaluation of the results obtained. Following these stages, the next sections show the practical application of data mining techniques to identify behavioral features in a sample of Brazilian companies.

DATA COMPILATION AND PROCESSING: INSTRUMENT FOR THE IDENTIFICATION OF ORGANIZATIONAL IMAGES IN A SAMPLE OF BRAZILIAN COMPANIES
For the identification of images, an instrument developed by Johann (2004) was used. This instrument is a questionnaire with 35 questions on organizational aspects that are grouped into 7 blocks; each block is associated with one of the images considered. To identify the characteristics of the images in an organization, a set of employees makes a quantitative assessment of each of the 35 questions. The evaluation uses a discrete scale with values between 1 and 4, according to the following criteria: 4 if there is a strong presence, 3 if there is a reasonable presence, 2 if there is little impact and 1 if there is virtually no presence.
The Appendix shows the 35 questions selected and Table 1 shows the relationship of each question with one of the 7 images considered. From the answers to the 35 questions, 7 numerical values are generated by summing the scores of the 5 questions related to each of the 7 images. These 7 values can be taken into account in determining the most relevant image in the company, according to the opinions of the employee interviewed. An example of the tabulation of answers to the questionnaire is shown in Table 2. The sums of the scores associated with each of the images are shown in the last row. In the case presented in Table 2, the most visible organizational image is that of the "political system" (SP, the image introduced above as PS), but images M, C and ID also obtain high scores.
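The scoring rule described above can be sketched in a few lines of Python. Table 1 is not reproduced in this text, so the question-to-image mapping below is a hypothetical placeholder (consecutive blocks of 5 questions), and the answers are made up for illustration; only the rule itself (sum of the 5 answers per image, each answer between 1 and 4) comes from the description.

```python
# Sketch: turn one employee's 35 answers into 7 image scores.
IMAGES = ["M", "O", "SP", "C", "ID", "FT", "PP"]

# HYPOTHETICAL mapping standing in for Table 1 (not the real one):
# questions 1-5 -> first image, 6-10 -> second image, and so on.
QUESTION_TO_IMAGE = {q: IMAGES[(q - 1) // 5] for q in range(1, 36)}


def image_scores(answers):
    """Sum the 1-4 answers of the 5 questions linked to each image."""
    if sorted(answers) != list(range(1, 36)):
        raise ValueError("expected answers to questions 1..35")
    scores = {image: 0 for image in IMAGES}
    for question, value in answers.items():
        if value not in (1, 2, 3, 4):
            raise ValueError(f"answer to question {question} must be 1-4")
        scores[QUESTION_TO_IMAGE[question]] += value
    return scores


# Made-up answers for illustration only (not data from the study).
example_answers = {q: 1 + (q % 4) for q in range(1, 36)}
example_scores = image_scores(example_answers)
most_relevant = max(example_scores, key=example_scores.get)
```

Each score necessarily lies between 5 (all answers equal to 1) and 20 (all answers equal to 4), which is the range of the variables used later in the clustering.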

To analyze the organizational images with the greatest presence in the state of Rio Grande do Sul (Brazil), a sample of 61 companies from various sectors and sizes was selected. In each company a group of up to 4 employees was interviewed, resulting in a total of 198 responses to the questionnaire (a mean of 3.25 responses per company). All data were pre-processed for analysis with data mining techniques.

APPLICATION OF DATA MINING: FUZZY CLUSTERS IDENTIFICATION
Clustering algorithms were applied to try to identify groups of companies that, according to their employees, respond to similar images. We used R, a free software environment for statistical computing and graphics, which can be downloaded from http://www.r-project.org/. This software implements a great variety of clustering algorithms; the Fuzzy C-Means (FCM) algorithm, implemented in package 'e1071', was selected. The cmeans command needs several parameters to run:
• The data matrix, where columns correspond to variables and rows to observations. In our case 7 variables were considered: for each of the 7 blocks, the sum of the scores of its 5 questions given by each employee, averaged over the employees of the company. The data matrix has 61 rows (companies).
• Number of clusters or initial values for cluster centers. In our case we decided to give 7 initial cluster centers. The center of cluster i was initially defined as: [expression missing in the source]. Note that 5 is the minimum value and 20 the maximum in a block of 5 questions.
• Maximum number of iterations; the value 500 was used.
• Distance measure to use; we used the "euclidean" distance.
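The study itself ran the cmeans command of the R package 'e1071'. As an independent illustration of what such a run computes, the following is a minimal pure-Python sketch of fuzzy c-means that takes the same kinds of inputs listed above (data matrix, initial centers, maximum number of iterations, Euclidean distance). The fuzzifier m = 2, the tolerance `tol` and the toy data are assumptions made for this sketch; the paper does not report them.

```python
import math


def fuzzy_c_means(data, centers, m=2.0, max_iter=500, tol=1e-9):
    """Plain fuzzy c-means with Euclidean distance (illustrative sketch).

    data: list of rows (observations); centers: list of initial centroids.
    Returns (centers, memberships, iterations, objective).
    """
    centers = [list(c) for c in centers]
    previous = math.inf
    power = 2.0 / (m - 1.0)
    for iteration in range(1, max_iter + 1):
        # Membership update from the relative distances to all centroids.
        memberships = []
        for x in data:
            dists = [math.dist(x, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # Point sits on a centroid: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [1.0 / sum((dj / dk) ** power for dk in dists)
                       for dj in dists]
            memberships.append(row)
        # Centroid update: mean of the data weighted by u_ij ** m.
        for j in range(len(centers)):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            if total > 0.0:
                centers[j] = [
                    sum(w * x[k] for w, x in zip(weights, data)) / total
                    for k in range(len(data[0]))
                ]
        # Objective J_m; stop when it no longer decreases noticeably.
        objective = sum(
            memberships[i][j] ** m * math.dist(x, centers[j]) ** 2
            for i, x in enumerate(data)
            for j in range(len(centers))
        )
        if previous - objective < tol:
            break
        previous = objective
    return centers, memberships, iteration, objective


# Toy data (made up): two tight groups and one point in between.
toy_data = [[5, 6, 5], [6, 5, 6], [19, 20, 18], [20, 19, 20], [12, 13, 12]]
toy_centers = [[5, 5, 5], [20, 20, 20]]
final_centers, u, n_iter, error = fuzzy_c_means(toy_data, toy_centers)
```

In the toy run, the points in the two tight groups receive a membership close to 1 in their own cluster, while the in-between point is split between both clusters, which is the behavior the soft clustering approach relies on.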
The algorithm needed a total of 218 iterations to converge, and the final error was 3.8954. After the 218 iterations, the cluster centers were updated as shown in Table 3. Bold identifies the highest values in each centroid, that is, the images that obtained the highest scores in each group.
Table 4 shows a ranking of the images with the greatest presence in each group. As can be seen, image O is clearly the most relevant in most groups. Another image with a high presence in the groups is FT. The images with the smallest presence in the sample are PP and SP.

Cluster  M         O         SP        C         ID        FT        PP        More relevant images in the cluster
1        14.06969  14.14340  10.75342  12.25663  12.81599  13.73587  10.68910  O, M, FT
2        13.43621  14.91055  11.87856  13.15049  12.30009  13.90021  10.71262  O, FT, M
3        13.87489  13.30586  14.32899  12.72550  14.97622  14.11905  13.77686  ID, SP, FT
4        14.89808  16.73978  12.87971  15.28379  14.23304  16.49389  12.38194  O, FT, C
5        14.21077  14.71124  13.08771  13.61664  14.26382  14.52736  12.50887  O, FT, ID
6        15.03585  15.98669  12.03754  14.72027  13.31240  14.95138  11.05963  O, M, FT
7        14.31467  14.60668  13.31190  13.63852  14.39336  14.56143  12.62568  O, FT, ID

Table 3. Cluster centers and more relevant images after the execution of the FCM algorithm.
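The "more relevant images" listed in Table 3 can be recomputed from the centroid values by taking the three largest coordinates of each centroid. The short Python sketch below does this; the column order M, O, SP, C, ID, FT, PP is inferred here from the values and the listed images, since the column headers are not fully legible in the text.

```python
# Column order inferred from the listed "more relevant images".
IMAGES = ["M", "O", "SP", "C", "ID", "FT", "PP"]

# Final cluster centers as reported in Table 3.
CENTERS = {
    1: [14.06969, 14.14340, 10.75342, 12.25663, 12.81599, 13.73587, 10.68910],
    2: [13.43621, 14.91055, 11.87856, 13.15049, 12.30009, 13.90021, 10.71262],
    3: [13.87489, 13.30586, 14.32899, 12.72550, 14.97622, 14.11905, 13.77686],
    4: [14.89808, 16.73978, 12.87971, 15.28379, 14.23304, 16.49389, 12.38194],
    5: [14.21077, 14.71124, 13.08771, 13.61664, 14.26382, 14.52736, 12.50887],
    6: [15.03585, 15.98669, 12.03754, 14.72027, 13.31240, 14.95138, 11.05963],
    7: [14.31467, 14.60668, 13.31190, 13.63852, 14.39336, 14.56143, 12.62568],
}


def top_images(centroid, k=3):
    """Return the k images with the highest centroid values."""
    ranked = sorted(zip(centroid, IMAGES), reverse=True)
    return [image for _, image in ranked[:k]]


ranking = {cluster: top_images(c) for cluster, c in CENTERS.items()}
```

Under the inferred column order, the computed ranking reproduces the last column of Table 3 for all seven clusters, which supports that reading of the table.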
As a result of the algorithm we obtained a matrix with the degrees of membership of each company for each of the 7 groups identified. Table 5 shows this matrix; the membership levels can also be shown using a density plot, as in Figure 1. A graphical representation of relationships between variables in the clusters is also shown in Figure 2. Figure 1 shows the companies on the vertical axis and membership levels on the horizontal axis. The darker shades in that graph correspond to higher membership values. As shown, in some companies there is a clear association with one of the groups, but in most cases the association with a single group is not as clear. The same conclusion can be reached by observing Figure 2, which shows the pairwise relationships between the variables used for performing the process of clustering.

EVALUATION OF RESULTS
As shown in Table 3, after executing the algorithm the centroids of each group are not clearly related to a single image. Instead, each group has values assigned to each feature (image) which are very different from those initially chosen. Although in most cases the image initially linked to the centroid is among those most present in the final centroid, in one case, that corresponding to group 7, the initial image (PP) does not have a strong presence in the final centroid. In fact, its value is the lowest value obtained in the centroid of the group. This seems to confirm that this image does not have a strong presence in the sample analyzed. In addition, these final centroids show that most of the companies seem to fit a mixed image, with a combination of characteristics from different images or organizational metaphors.
The membership levels allow us to analyze the presence of organizational images for the companies in the sample. For example, the FCM algorithm has assigned the following levels of group membership to company 1: [membership vector missing in the source]. As can be seen, in this case no single cluster can be clearly linked to this company. Instead, there are four clusters with similar degrees of membership, quite different from the rest: clusters 4, 5, 6 and 7. According to the centroids of these groups (see Table 3), the most relevant organizational images in these clusters are O and FT. Specifically, if an image k is considered, the membership levels (u_ij) and the final centroids (c_ij) could be used to obtain a quantitative assessment eval(k,p) of the presence of image k in company p, using the following expression: [expression missing in the source]. In the case of company 1, the maximum values of eval(k,1) are reached for k=2 (associated with image O) and k=6 (associated with image FT), with values 15.14 and 14.86, respectively.
In some cases, the FCM algorithm is able to allocate a cluster to a particular company more clearly. For example, for company 34, the membership levels obtained by the algorithm are: u_34 = (0.0181, 0.0190, 0.7434, 0.0188, 0.0831, 0.0184, 0.0993). In this case, we can see a clear link between the company and group 3. Figure 3 shows the difference with the previous case. If the function eval() is evaluated, the most relevant images in company 34 are ID and FT. These results confirm that this company also seems to have characteristics common to different images.
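The expression for eval(k,p) is not reproduced in the text above. One form that is consistent with the ranking reported for company 34 is a membership-weighted sum of the centroid values, eval(k,p) = Σ_j u_pj · c_jk. The Python sketch below assumes that form; it is an assumption made for illustration, not the authors' confirmed definition. The centroids are those of Table 3, with the column order M, O, SP, C, ID, FT, PP inferred from the table.

```python
# Column order inferred from Table 3 (assumption).
IMAGES = ["M", "O", "SP", "C", "ID", "FT", "PP"]

# Final cluster centers from Table 3, clusters 1..7.
CENTERS = [
    [14.06969, 14.14340, 10.75342, 12.25663, 12.81599, 13.73587, 10.68910],
    [13.43621, 14.91055, 11.87856, 13.15049, 12.30009, 13.90021, 10.71262],
    [13.87489, 13.30586, 14.32899, 12.72550, 14.97622, 14.11905, 13.77686],
    [14.89808, 16.73978, 12.87971, 15.28379, 14.23304, 16.49389, 12.38194],
    [14.21077, 14.71124, 13.08771, 13.61664, 14.26382, 14.52736, 12.50887],
    [15.03585, 15.98669, 12.03754, 14.72027, 13.31240, 14.95138, 11.05963],
    [14.31467, 14.60668, 13.31190, 13.63852, 14.39336, 14.56143, 12.62568],
]

# Membership levels of company 34 as reported in the text.
U_34 = [0.0181, 0.0190, 0.7434, 0.0188, 0.0831, 0.0184, 0.0993]


def image_presence(memberships, centers):
    """ASSUMED form of eval(k, p): sum over clusters j of u_pj * c_jk."""
    n_images = len(centers[0])
    return [
        sum(u * center[k] for u, center in zip(memberships, centers))
        for k in range(n_images)
    ]


presence_34 = dict(zip(IMAGES, image_presence(U_34, CENTERS)))
top_two_34 = sorted(presence_34, key=presence_34.get, reverse=True)[:2]
```

Under this assumed form the two highest-scoring images for company 34 are ID and FT, in line with the result stated in the text; the membership vector of company 1 is not available here, so the values 15.14 and 14.86 cannot be rechecked in the same way.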

Figure 3. Image levels for companies 1 and 34 in the sample.
The R cmeans command also generates the closest hard clustering solution. This information is also useful for identifying significant groups. Table 6 shows the number of companies in the 7 hard clusters after the execution of the algorithm.
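The closest hard clustering can be understood as assigning each company to the cluster in which it has its maximal membership. The Python sketch below illustrates that rule with the membership vector reported for company 34; the small membership matrix used to count cluster sizes is made up, since Table 5 is not reproduced in this text.

```python
from collections import Counter


def closest_hard_cluster(memberships):
    """Return the 1-based index of the (first) cluster with maximal membership."""
    best = max(range(len(memberships)), key=memberships.__getitem__)
    return best + 1


def cluster_sizes(membership_matrix):
    """Count how many rows (companies) fall in each hard cluster."""
    return Counter(closest_hard_cluster(row) for row in membership_matrix)


# Membership levels of company 34 as reported in the text.
U_34 = [0.0181, 0.0190, 0.7434, 0.0188, 0.0831, 0.0184, 0.0993]
hard_34 = closest_hard_cluster(U_34)

# Made-up two-cluster memberships, for illustration only.
toy_sizes = cluster_sizes([[0.7, 0.3], [0.2, 0.8], [0.9, 0.1]])
```

For company 34 this rule selects cluster 3, matching the clear link with group 3 described in the text.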

As Table 6 shows, three groups contain more companies than the rest: clusters 1, 3 and 4. Cluster 6 is the one with the fewest companies. The next section discusses the results and possible lines for future work.

DISCUSSIONS
In conclusion, this paper analyzed the potential of data mining techniques to extract knowledge about organizational aspects in a sample of Brazilian companies. In addition, soft computing has shown itself to be a very suitable tool for identifying organizational patterns, where the difference between some patterns and others is not so clear.
The results of applying soft clustering techniques confirm the difficulty of associating a single image or metaphor with a company, as features of the other images are also present. However, according to these results, the most visible image in the organizations studied is that of "organisms". This image is the most relevant in 6 out of the 7 groups identified. This metaphor means seeing businesses as behaving in ways similar to our own biological mechanisms; successful businesses are often adaptable and open to change, and their structures and procedures are less rigid. Central to this metaphor is the theory of open systems, which are "open" to their environment and have to achieve appropriate relationships with it in order to survive. It is also remarkable that the "flux and transformation" image appears in all groups with a high value. So, characteristics such as constant change, dynamic equilibrium, flow, self-organization, systemic wisdom, attractors, chaos, complexity, butterfly effect, emergent properties, dialectics, and paradox are also present in most of the companies.
All the groups obtained seem to have a very similar structure, with similar most relevant images, but one of them (cluster 3) is clearly different from the rest, with a high degree of association with images that are less present in the other groups, such as the "political system" and "instrument of domination" images.
In summary, this work confirms the difficulty of linking a company with a single image, but it has made it possible to see which images have a greater presence in companies operating in Rio Grande do Sul.
With respect to obtaining organizational patterns, it is necessary to point out that the valuations must be carried out in the context of the specific experience analyzed. Thus, it is important to remember that the data analyzed correspond to a small sample of companies. The sample includes companies from various sectors and sizes, making it difficult to draw conclusions that can be generalized. It is necessary to extend the study with a larger sample size. It would also be interesting to carry out sector analysis to try to identify organizational features which are typical of companies in certain sectors, as well as geographically comparative studies.
In each company a group of up to 4 employees was interviewed; in some cases significant differences in the perception of different employees were observed. It would therefore be interesting to try to analyze these differences in perception, depending on the type of company and the employee profile.
From the point of view of applying soft clustering techniques, another line of research that opens from this work is the application of other soft clustering algorithms, in particular the use of algorithms that do not require the previous definition of the number of groups to be created.
In any case, the study has served to demonstrate the usefulness of the methodology proposed and to draw some conclusions about organizational images that seem to have a presence in Brazilian companies.

APPENDIX: QUESTIONNAIRE FOR THE IDENTIFICATION OF ORGANIZATIONAL IMAGES
1) Procedures, operations and processes are standardized.
2) Changes in the organization are normally a reaction to changes that already occurred in the macro business environment.
3) Administrators frequently talk about authority, power and superior-subordinate relationships.
4) Flexible and creative action.
5) Working in inadequate circumstances and conditions is considered a proof of loyalty to the organization.
6) The organization sees itself as a part of a larger system where there is an interdependence that involves the community, suppliers and the competition.
7) People and groups tend to display infantile behavior.
8) Past achievements are constantly cited as references and as examples on how to deal with present situations and how to face future adversities.
9) The organization evolves in harmony and balance with its macro environment.
10) People act under constant stress and pressure.
11) There is constant questioning and redirection of actions.
12) Power serves to provide discipline and achieve order in conflicts of interest.
13) The organization considers the motivations and needs of people.
14) There are rigid patterns and uniformity in people's behavior.
15) The company has and utilizes a great number of rules, norms and regulations about operational aspects of the business.
16) [question missing in the source]
17) The delegation of power to operational levels tends to be very restricted.
18) Negative feedback is encouraged to correct the organizational direction.
19) The organization expects complete devotion and dedication from its employees.
20) The company benefits more from external events (environmental, etc.) than from strict planning.
21) There are many taboos and prejudices in the organization.
22) The relationships between superiors and subordinates tend to contain elements of love and hate.
23) Long term achievements will be achieved in partnership with the forces acting with the macro-environment and not against it.
24) To dismiss people and streamline activities are part of the game.
25) Most people think about and influence the destiny of the company.
26) Interpersonal gossip consumes energy and diverts attention from productivity.
27) Organizational objectives and people's needs can be met simultaneously.
28) The organization is a realm of bureaucracy.
29) The organization is expected to operate in a routine, efficient, reliable and predictable manner.
30) Employees are seen as valuable resources who can offer rich and varied contributions to the organization's activities, provided that the organization attends to their needs and motivations.
31) Rumors and gossip are frequent.
32) The organization tends to offer quick answers to changes in its macro-environment.
33) The organization values executives who appear framed and faithful to the mode of being of the company.
34) In strategic decision making the company normally abandons the simple view and prefers to take into account the complexity of the situation.
35) People are dedicated to the organization because they feel they belong to something greater, which transcends their existence and individual limitations.

Table 1. Relationships between questions and organizational images.

Table 2. Example of answers to the questionnaire.

Table 6. Cluster sizes in the closest hard clustering.