Grouping of Data

Reviewed by:

Rama Sharma

What is Grouping Data?

Grouping data plays a significant role when we have to deal with large data sets. Grouped data is formed by arranging the individual observations of a variable into groups, so that a frequency distribution table of these groups provides a convenient way of summarising or analysing the data. This is how we define grouped data. The grouped information can also be displayed using a pictograph or a bar graph.

In mathematics, the topic of grouping data teaches us to define grouped data precisely. When the number of observations is very large, we condense the data into several groups and record the frequency of observations falling in each group. The presentation of data in groups, along with the frequency of each group, is called the frequency distribution of the grouped data.

What are the Advantages of Grouping Data?

The advantages of grouping data in statistics are:

It helps to focus on important subpopulations and ignore irrelevant ones.
Grouping of data improves the accuracy/efficiency of estimation.

Frequency Distribution Table for Grouped Data

When the collected data is large, a frequency distribution table for grouped data makes it easier to analyse. The example below shows the approach.

Example

Consider the marks of 50 students of class VII obtained in an examination. The maximum mark of the exam is 50.

23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25, 9, 15, 20, 30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11, 34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22, 20, 33, 39, 40, 32

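If we made a frequency distribution table with a separate row for every observation, the table would be very large. For easier understanding, we group the observations into classes such as 0-10, 10-20, and so on, and count the observations in each class:

Marks (class interval)    Number of students (frequency)
0-10                      3
10-20                     11
20-30                     14
30-40                     14
40-50                     8
Total                     50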

The distribution obtained in the above table is known as the grouped frequency distribution. It helps us draw several useful inferences, such as:

Most students (28 out of 50) have secured between 20 and 40 marks, i.e. they fall in the classes 20-30 and 30-40.
8 students have secured 40 marks or more, i.e. they got 80% or more in the examination.

In the table obtained above, the groups 0-10, 10-20, 20-30, … are known as class intervals (or classes). Notice that 10 appears in two intervals, 0-10 and 10-20. Similarly, 20 appears in both 10-20 and 20-30. But an observation of 10 or 20 cannot belong to two classes at the same time. To avoid this inconsistency, we adopt the rule that a common observation belongs to the higher class. This means that 10 belongs to the class interval 10-20 and not to 0-10. Similarly, 20 belongs to 20-30 and not to 10-20, and so on.
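The boundary rule can be checked with a short Python sketch. It tallies the 50 marks from the example into classes that include their lower limit and exclude their upper limit. The function name `grouped_frequencies` and its parameters are illustrative choices, not part of the original lesson, and the sketch assumes whole-number marks that all lie inside the listed classes.

```python
# Marks of the 50 students from the example above.
marks = [23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25,
         9, 15, 20, 30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11,
         34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22, 20, 33, 39, 40, 32]


def grouped_frequencies(values, start, width, num_classes):
    """Count how many values fall in each class [lower, upper).

    Hypothetical helper: each class includes its lower limit and excludes
    its upper limit, so a boundary value such as 10 is counted in 10-20.
    """
    counts = [0] * num_classes
    for v in values:
        index = (v - start) // width  # 10 // 10 == 1, i.e. the class 10-20
        if not 0 <= index < num_classes:
            raise ValueError(f"{v} lies outside the class intervals")
        counts[index] += 1
    return counts


frequencies = grouped_frequencies(marks, start=0, width=10, num_classes=5)
for i, f in enumerate(frequencies):
    print(f"{i * 10}-{(i + 1) * 10}: {f}")
```

Under this rule a mark of exactly 50 would need an extra class 50-60. No student in this data set scored 50, so five classes are enough.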

Consider a class, say 10-20, where 10 is the lower class limit and 20 is the upper class limit. The difference between the upper and lower class limits is called the class height, class size or class width of the class interval.

This is how we create a frequency distribution table for grouped data as shown above.

Histogram

We can show the above frequency distribution table graphically using a histogram. The class intervals are marked on the horizontal axis and the frequencies on the vertical axis.
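As a rough illustration, a histogram can be imitated in plain text with Python. The sketch below is rotated: the classes run down the side and each bar grows to the right, so it is not a true histogram drawing. The frequencies are counted from the 50 marks in the example, with boundary values placed in the higher class. The helper name `text_histogram` is hypothetical.

```python
def text_histogram(labels, frequencies, symbol="#"):
    """Return one text line per class: label, bar of `symbol`, frequency."""
    label_width = max(len(label) for label in labels)
    lines = []
    for label, f in zip(labels, frequencies):
        # The bar length equals the frequency of the class.
        lines.append(f"{label.rjust(label_width)} | {symbol * f} ({f})")
    return lines


labels = ["0-10", "10-20", "20-30", "30-40", "40-50"]
frequencies = [3, 11, 14, 14, 8]  # counted from the 50 marks in the example

for line in text_histogram(labels, frequencies):
    print(line)
```

The bars have no gaps between neighbouring classes, which mirrors how the bars of a histogram touch each other.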

Grouped Data Examples with Step-by-Step Explanations

Example 1. The marks obtained by forty students of class VIII in an examination are listed below:

16, 17, 18, 3, 7, 23, 18, 13, 10, 21, 7, 1, 13, 21, 13, 15, 19, 24, 16, 2, 23, 5, 12, 18, 8, 12, 6, 8, 16, 5, 3, 5, 0, 7, 9, 12, 20, 10, 2, 23

Divide the data into five groups, namely, 0-5, 5-10, 10-15, 15-20 and 20-25, where 0-5 means marks greater than or equal to 0 but less than 5 and similarly 5-10 means marks greater than or equal to 5 but less than 10, and so on. Prepare a grouped frequency table for the grouped data.

Solution: First, arrange the given observations in ascending order:

0, 1, 2, 2, 3, 3, 5, 5, 5, 6, 7, 7, 7, 8, 8, 9, 10, 10, 12, 12, 12, 13, 13, 13, 15, 16, 16, 16, 17, 18, 18, 18, 19, 20, 21, 21, 23, 23, 23, 24

Thus, the frequency distribution of the data may be given as follows:

Marks (class interval)    Number of students (frequency)
0-5                       6
5-10                      10
10-15                     8
15-20                     9
20-25                     7
Total                     40

Note: Here, each of the groups 0-5, 5-10, 10-15, 15-20 and 20-25 is known as a class interval. In the class interval 10-15, the number 10 is the lower limit and 15 is the upper limit. The difference between the upper limit and the lower limit of a class interval is known as the class size.

Thus, the class size in the above frequency distribution is equal to 5.

The mid-value of a class is known as its class mark. The class mark is obtained by adding the upper and lower class limits and dividing the sum by 2.

Thus, the class mark of the class 0-5 is (0 + 5)/2 = 2.5

And the class mark of the class 5-10 is (5 + 10)/2 = 7.5, etc.
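The same arithmetic can be written as a minimal Python sketch for all five classes of Example 1. The function names `class_size` and `class_mark` are illustrative only.

```python
def class_size(lower, upper):
    """Class size = upper limit - lower limit."""
    return upper - lower


def class_mark(lower, upper):
    """Class mark = (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


# Class intervals used in Example 1.
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]

for lower, upper in classes:
    size = class_size(lower, upper)
    mark = class_mark(lower, upper)
    print(f"{lower}-{upper}: class size = {size}, class mark = {mark}")
```

Every class has size 5, and the class marks come out as 2.5, 7.5, 12.5, 17.5 and 22.5.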

Questions to be Solved:

Question 1) The weights (in kg) of 35 persons are given below:

43, 51, 62, 47, 48, 40, 50, 62, 53, 56, 40, 48, 56, 53, 50, 42, 55, 52, 48, 46, 45, 54, 52, 50, 47, 44, 54, 55, 60, 63, 58, 55, 60, 53, 58

Prepare a frequency distribution table using classes of equal size, where one such class is 40-45 (45 is not included).

FAQs on Grouping of Data

1. What is meant by grouping data in statistics?

Grouping data is the process of organising a large set of individual observations into smaller, more manageable groups or class intervals. Instead of listing every single data point, we summarise the data by counting how many observations fall into each group. This summarised presentation, often shown in a frequency distribution table, makes it much easier to analyse and understand the characteristics of the data.

2. What is the difference between grouped and ungrouped data?

The main difference lies in their presentation and organisation:

Ungrouped Data: This is raw data, presented as a list of individual values. For example, the marks of 10 students: 23, 41, 15, 33, 29, 45, 18, 23, 38, 40. It is difficult to see patterns in this form.
Grouped Data: This is data that has been organised into class intervals. Using the same marks, we could group them into intervals like 10-20, 20-30, 30-40, and 40-50, and then count the number of students in each group (frequency).

Essentially, grouping transforms raw, individual data points into a summarised, structured format.

3. How do you convert raw, ungrouped data into grouped data?

To convert ungrouped data into grouped data, you can follow these steps:

Determine the Range: Find the highest and lowest values in the dataset and calculate the difference (Range = Highest Value - Lowest Value).
Decide the Number of Classes: Choose how many groups (class intervals) you want to create. This decision affects how the data is summarised.
Calculate the Class Width: Divide the range by the number of classes you decided on. This gives you the size of each class interval. It's often rounded up to a convenient number.
Create Class Intervals: Starting from the lowest value, create continuous class intervals using the calculated class width (e.g., 0-10, 10-20, 20-30).
Tally the Frequencies: Go through the raw data and place a tally mark for each value against the class interval it falls into. Finally, count the tallies to find the frequency for each class.
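The five steps above can be sketched in Python using the ten marks from question 2. Two choices in this sketch are assumptions rather than part of the steps: the class width is rounded up to the next multiple of 5 as the "convenient number", and the first class starts at a multiple of the width at or below the lowest value, so that the classes come out as 10-20, 20-30 and so on. The function name `group_data` is hypothetical, and whole-number data is assumed.

```python
import math


def group_data(values, num_classes, step=5):
    """Turn raw whole-number data into a {(lower, upper): frequency} table."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest                 # Step 1: range
    raw_width = data_range / num_classes          # Steps 2-3: width for the chosen class count
    # Assumption: round the width up to the next multiple of `step`.
    width = max(step, math.ceil(raw_width / step) * step)
    # Assumption: start at a multiple of the width at or below the lowest value.
    start = (lowest // width) * width

    # Step 4: continuous class intervals that cover the highest value.
    intervals = []
    lower = start
    while lower <= highest:
        intervals.append((lower, lower + width))
        lower += width

    # Step 5: tally each value; a boundary value goes to the higher class.
    table = {interval: 0 for interval in intervals}
    for v in values:
        index = int((v - start) // width)
        table[intervals[index]] += 1
    return table


marks = [23, 41, 15, 33, 29, 45, 18, 23, 38, 40]
for (lower, upper), frequency in group_data(marks, num_classes=4).items():
    print(f"{lower}-{upper}: {frequency}")
```

Here the range is 45 - 15 = 30, so four classes give a raw width of 7.5, which is rounded up to 10. Because of the rounding, the final number of classes may differ from the number first chosen for other data sets.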

4. Why is grouping data important, especially for large datasets?

Grouping data is crucial for large datasets because it simplifies complexity and reveals underlying patterns that are invisible in raw data. Its main advantages are:

Summarisation: It condenses thousands of data points into a compact frequency table, making the dataset easier to read and interpret.
Pattern Recognition: It helps in identifying the distribution, concentration, and trends within the data. For example, you can quickly see which range has the most or fewest values.
Easier Visualisation: Grouped data is a prerequisite for creating powerful visual tools like histograms, which clearly show the shape of the data's distribution.
Efficient Analysis: It simplifies the calculation of statistical measures like the mean, median, and mode for large datasets.

5. What are the key terms used in a grouped frequency distribution?

When working with grouped data, you will encounter these important terms:

Class Interval: The range or group into which the data is divided, like '10-20' or '20-30'.
Class Limits: Each class interval has a Lower Class Limit (the smallest value) and an Upper Class Limit (the largest value). For the interval 10-20, 10 is the lower limit and 20 is the upper limit.
Class Size (or Width): The difference between the upper and lower class limits. For 10-20, the class size is 10.
Class Mark: The midpoint of a class interval, calculated as (Upper Limit + Lower Limit) / 2. For 10-20, the class mark is (10+20)/2 = 15.

6. Can you provide some real-world examples of grouped data?

Grouped data is used in many real-world scenarios to make sense of large amounts of information. Some common examples include:

Age Demographics: Census data often groups population by age, such as 0-10 years, 11-20 years, 21-30 years, and so on.
Income Brackets: A survey of household income might group the results into brackets like ₹0-₹2 Lakhs, ₹2-₹5 Lakhs, ₹5-₹10 Lakhs, etc.
Exam Scores: A school might analyse student performance by grouping marks, such as 91-100 (A), 81-90 (B), 71-80 (C), and so on.
Height or Weight Measurement: In a health survey, heights of individuals might be grouped into intervals like 150-155 cm, 155-160 cm, 160-165 cm.

7. What is the key difference between a histogram and a bar graph?

While both are graphical representations using bars, a histogram and a bar graph are used for different types of data and have a key visual difference:

Type of Data: A histogram is used to represent the frequency distribution of continuous data that has been grouped into class intervals (e.g., heights, weights, marks). A bar graph is used to represent discrete, categorical data (e.g., favourite colours, types of pets, months).
Bars: In a histogram, the bars are drawn adjacent to each other with no gaps in between, signifying that the data is continuous. In a bar graph, there are equal spaces between the bars, as each bar represents a separate, distinct category.

8. How does the choice of class width affect the interpretation of grouped data?

The choice of class width (or class size) is critical because it can significantly change the story the data tells.

If the class width is too wide, too much data gets lumped together. This can hide important details and patterns, giving a very flat and uninformative look to the data distribution.
If the class width is too narrow, you create too many small groups, many of which might have zero or very low frequencies. This can create a lot of noise and make it difficult to see the overall trend or shape of the distribution.

Therefore, selecting an appropriate class width is a balance between summarising the data enough to see a clear pattern and retaining enough detail to be meaningful.
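To see this effect on actual numbers, the sketch below regroups the 50 marks from the class VII example with three different class widths. It is only an illustration: the function name `frequencies_for_width` is hypothetical, the width is assumed to divide the maximum of 50 exactly, and all marks are assumed to be whole numbers below 50.

```python
# Marks of the 50 students from the class VII example.
marks = [23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25,
         9, 15, 20, 30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11,
         34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22, 20, 33, 39, 40, 32]


def frequencies_for_width(values, width, maximum=50):
    """Frequencies for the classes 0-width, width-2*width, ... up to `maximum`.

    Assumes `width` divides `maximum` and every value is below `maximum`.
    """
    counts = [0] * (maximum // width)
    for v in values:
        counts[v // width] += 1  # a boundary value goes to the higher class
    return counts


for width in (25, 10, 2):
    counts = frequencies_for_width(marks, width)
    print(f"width {width}: {len(counts)} classes -> {counts}")
```

With a width of 25 there are only two classes, which hides where the marks are concentrated. With a width of 2 there are 25 classes, each holding only a few students. A width of 10 gives five classes and a readable pattern.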