What is a histogram?
The histogram is probably the most popular graph used in statistical analysis and is considered one of the basic tools for quality improvement.
This graph shows the so-called distribution of the examined feature across specific intervals. Simply put, the histogram presents how often given values occur (e.g. age, temperature, size, weight, earnings, education, etc.) in the ranges we are interested in (from … to …).
And visually …
Visually, it is a simple bar graph: the values of interest are marked on the x-axis, and the y-axis shows how many times they occur. The values themselves are drawn as adjoining bars.
How should a single rectangle be interpreted?
Let’s use the example with temperature again …
In the range between 21 and 22 degrees there are 3 bars. This means that each bar covers an interval of approx. 0.34 degrees. More specifically, the temperature of 21.34 degrees, marked as a green bar, was noted 12 times in the period under consideration.
How to prepare a histogram?
Collect the results
It’s best to collect more than 50 results, because this gives us a better chance of capturing the variability of the entire population studied.
It is just as important that the values are collected at random and that the results are collected in the same way every time. Only then can we be sure that all the collected data can be put on a single chart and that the chart properly represents the actual situation.
Determine the intervals in which you will present the data (i.e. the number of bars)
The easiest way to start is to determine the minimum and maximum values from the collected results. This also tells us the range of the data.
Choosing the number of intervals, each of which will be represented by one bar in the histogram, is not so easy. There are at least several ways to calculate it: division according to DIN 55302, standard E41.32.110N, the square root of the number of results rounded to the nearest integer, etc.
However, if we are not writing a thesis and the results of our analysis are intended exclusively for the internal needs of the organization, we can simply choose the ranges of values that interest us and build the histogram on that basis.
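As an illustration of the square-root rule mentioned above, here is a minimal Python sketch. The function name `sqrt_rule_bins` and the temperature readings are assumptions made up for this example; the DIN and E41 methods are not covered.

```python
import math


def sqrt_rule_bins(values):
    """Return (number_of_bins, bin_width) using the square-root rule."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    # Number of intervals: square root of the number of results,
    # rounded to the nearest integer (at least 1).
    k = max(1, round(math.sqrt(n)))
    lo, hi = min(values), max(values)
    # Range of the data divided evenly among the k intervals.
    width = (hi - lo) / k
    return k, width


# Hypothetical temperature readings, for illustration only.
temps = [20.1, 20.4, 20.9, 21.3, 21.3, 21.7, 22.0, 22.4,
         22.8, 23.5, 21.1, 21.9, 22.2, 20.7, 21.5, 23.0]

k, width = sqrt_rule_bins(temps)
print(f"{len(temps)} results -> {k} intervals, each about {width:.2f} wide")
```

With only 16 made-up readings the rule gives 4 intervals; with the 50 or more results recommended earlier it would give 7 or more.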
Specify the frequency of occurrence of given values in relation to the designated intervals
This is simply a matter of grouping all the data by interval and then counting how many values there are in each group.
On this basis, generate a column chart. The resulting graph is the histogram.
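The grouping and counting step can be sketched in a few lines of Python. This is only one possible approach: it assumes equal-width intervals, and the function name `histogram_counts` and the diameter values are hypothetical.

```python
def histogram_counts(values, lo, hi, k):
    """Count how many values fall into each of k equal-width intervals on [lo, hi]."""
    if k < 1 or hi <= lo:
        raise ValueError("need k >= 1 and hi > lo")
    width = (hi - lo) / k
    counts = [0] * k
    for v in values:
        if v < lo or v > hi:
            continue  # outside the chosen range; ignored in this sketch
        # A value equal to hi is placed in the last interval.
        idx = min(int((v - lo) / width), k - 1)
        counts[idx] += 1
    return counts


# Hypothetical diameter measurements [mm], for illustration only.
diameters = [10.0, 10.1, 10.2, 10.2, 10.3, 10.4, 10.4, 10.4, 10.6, 10.9, 11.0]

counts = histogram_counts(diameters, lo=10.0, hi=11.0, k=4)

# A crude text version of the column chart: one row per interval.
for i, c in enumerate(counts):
    start = 10.0 + i * 0.25
    print(f"{start:5.2f} to {start + 0.25:5.2f} | {'#' * c}")
```

Each row of `#` characters plays the role of one bar; a spreadsheet or plotting library would draw the same counts as columns.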
Interpretation of the histogram: Gaussian curve, variability, shape, process capability assessment
You can draw many conclusions from the generated histogram. You just need to know the basics of interpreting this chart.
On histograms 1 and 2, a Gaussian curve (red line) is marked, i.e. a line representing the normal distribution of the examined quantity. This lets you visually analyze, among other things, the variability of the features of interest.
As for the variability of the measured diameter [mm] in the example, it is greater in the second case. This is evidenced by the width of the histogram (compare the width of the Gaussian curve in histograms 1 and 2).
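To make the comparison of widths concrete, the sketch below computes the standard deviation of two samples and the height of the corresponding Gaussian curve. The two samples are invented for illustration and are not the data behind histograms 1 and 2. The helper `expected_count` is also an assumption: it scales the curve to the same units as the bar heights.

```python
import math
import statistics


def normal_pdf(x, mean, std):
    """Density of the normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def expected_count(x, mean, std, n, bin_width):
    """Height of the Gaussian curve at x, expressed in 'number of results' units."""
    return n * bin_width * normal_pdf(x, mean, std)


# Two hypothetical sets of diameter measurements [mm].
sample_1 = [9.9, 10.0, 10.0, 10.1, 10.1, 10.0, 9.9, 10.0]
sample_2 = [9.6, 10.4, 10.0, 9.8, 10.2, 10.3, 9.7, 10.0]

std_1 = statistics.stdev(sample_1)  # sample standard deviation
std_2 = statistics.stdev(sample_2)

print(f"std of sample 1: {std_1:.3f} mm")
print(f"std of sample 2: {std_2:.3f} mm")
print("wider curve:", "sample 2" if std_2 > std_1 else "sample 1")
```

A larger standard deviation means a wider and flatter curve, which is what the visual comparison of the two histograms shows.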
The first example also contains an interesting element: one of the measurement points deviates significantly from the others. When analyzing this case, do not take this measurement into account, because it is a so-called special effect (an outlier), i.e. a measurement that may be affected by an error.
The shape of the Gaussian curve also says a lot about the process or the quantity we measure. The narrower the curve, the smaller the standard deviation. In practice, this means that the measured quantity, even when it deviates somewhat, has a greater chance of staying within the tolerance of the given process.
On this basis, the process capability is assessed (Cp, Cpk). This topic will be described in more detail in the article on Six Sigma.
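As a rough illustration of the capability indices, the sketch below uses the commonly quoted definitions Cp = (USL - LSL) / 6σ and Cpk = min(USL - mean, mean - LSL) / 3σ. Several things here are assumptions for the example: the tolerance limits `lsl` and `usl`, the measurements, and the use of the overall sample standard deviation as the estimate of σ.

```python
import statistics


def process_capability(values, lsl, usl):
    """Return (Cp, Cpk) for the given measurements and tolerance limits."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation as the sigma estimate
    if std == 0:
        raise ValueError("standard deviation is zero; indices are undefined")
    # Cp compares the tolerance width with the process spread (6 sigma).
    cp = (usl - lsl) / (6 * std)
    # Cpk also accounts for how well the process is centered in the tolerance.
    cpk = min(usl - mean, mean - lsl) / (3 * std)
    return cp, cpk


# Hypothetical diameter measurements [mm] and tolerance limits.
diameters = [9.9, 10.0, 10.1, 10.0, 10.0]

cp, cpk = process_capability(diameters, lsl=9.7, usl=10.3)
print(f"centered tolerance:  Cp = {cp:.2f}, Cpk = {cpk:.2f}")

cp_shifted, cpk_shifted = process_capability(diameters, lsl=9.8, usl=10.4)
print(f"shifted tolerance:   Cp = {cp_shifted:.2f}, Cpk = {cpk_shifted:.2f}")
```

When the process mean sits in the middle of the tolerance, the two indices coincide; when the tolerance is shifted relative to the mean, Cpk drops below Cp.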
Where is the histogram applicable? Benefits
A histogram prepared on the basis of the data we are interested in provides us with information about what type of statistical distribution we are dealing with.
The histogram is widely used in the quality area. It is the basis for assessing process capability, planning experiments, and analyzing the impact of individual factors on the process.
The histogram and statistical methods have an enormous range of uses that extends to other areas of the organization’s functioning, not only production and quality. Business decisions based on reliable data and statistical analyses usually bring tangible benefits.
Histogram in HR:
- Selection of potential candidates
- Assessment of employees’ competences
- Measurement of effectiveness: training, recruitment, incentive and bonus systems, employee assessment and employee initiatives
- Evaluation of communication channels (formal, informal) in the organization
- Evaluation of changes in organizational structures
Histogram in IT:
- Evaluation of the implementation of the new system
- Analysis of helpdesk requests
- Measuring the effectiveness of IT processes
Histogram in Sales:
- Assessment of customer satisfaction
- Evaluation of sales competences
- Measuring the effectiveness of the sales department
- Assessment of complaints, non-compliance
What are the business benefits of using a histogram?
By analyzing the histogram with the Gauss curve drawn, we can obtain important information about the subject under investigation. Any deviation from the normal distribution shape may suggest that:
- the study omitted a significant factor, which distorted the results (for example: we examined the professional effectiveness of newly hired employees only, not of all employees)
- we have studied a specific group in which the level of the examined feature is above average (for example, engineers are characterized by a high level of analytical thinking)
The compatibility of the distribution of results with the Gaussian curve suggests that the sample we are looking at is representative and properly depicts the entire population. This means that our actions can be “targeted”, tailored to the business needs of the organization.
The histogram and the normal distribution depicted by the Gaussian curve thus play a key role in solving business problems by means of statistical analyses. They can support other methods, such as DMAIC.
The knowledge we receive when analyzing the histogram is extremely useful in everyday business practice. Knowing the simple rules of its interpretation, we can properly formulate business questions, which in turn translates into a more accurate selection of actions aimed at achieving the assumed business goals.