Histogram

Use me when you want to see the shape of your data - where the mountains and valleys are. I'll group your numbers into bins and show you the distribution. Are most values clustered in the middle? Skewed to one side? Multiple peaks? I'll reveal if your data is bell-shaped, wonky, or hiding surprises.

Overview

A histogram is a graphical representation of the distribution of numerical data. It groups values into bins (intervals) and displays the frequency or count of observations falling into each bin using bars.

Best used for:

Understanding data distribution patterns (normal, skewed, bimodal)
Identifying the central tendency and spread of data
Detecting outliers and unusual patterns
Comparing distributions across different groups
Quality control and process monitoring

Common Use Cases

Statistics & Data Analysis

Age distribution in a population
Test score distributions
Income or salary ranges
Measurement error analysis

Quality Control & Manufacturing

Measurement variation analysis
Process capability studies
Defect distribution patterns
Tolerance compliance checking

Data Science & Machine Learning

Feature distribution analysis before modeling
Identifying need for data transformations
Detecting skewness and outliers
Understanding target variable distribution

Options

Target Columns

Required - Select one or more numerical columns to visualize.

You can add multiple columns with the "+" button to compare their distributions on the same plot. Each column is shown in a different color.

Settings

Show Frequency

Optional - Display count or frequency on Y-axis.

On: Shows actual count of observations in each bin
Off: Shows probability density (normalized)

Show Legend

Optional - Display legend when multiple columns are shown.

Useful when comparing distributions of multiple variables.

Show Axis Labels

Optional - Display axis labels.

Annotate Bars

Optional - Show values on top of each bar.

Displays the count or frequency for each bin directly on the histogram.

Show KDE

Optional - Overlay a Kernel Density Estimate curve.

A KDE provides a smooth, continuous estimate of the probability density function.
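For intuition, a Gaussian KDE places a small bell curve on every observation and averages them. The sketch below is a minimal illustration in plain Python, not the tool's implementation; the function name `simple_gaussian_kde`, the fixed `bandwidth` argument, and the sample values are assumptions made for the example.

```python
import math

def simple_gaussian_kde(data, bandwidth):
    """Return a function that estimates the density at x (Gaussian kernel)."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centered on that observation.
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density

# Hypothetical sample with values clustered around 3 and 8.
sample = [2.5, 3.0, 3.2, 3.5, 7.8, 8.0, 8.3]
kde = simple_gaussian_kde(sample, bandwidth=0.5)

# Evaluate the curve on a grid; the area under it should be close to 1.
step = 0.01
grid = [i * step for i in range(-500, 1600)]  # covers -5 to just under 16
area = sum(kde(x) for x in grid) * step
```

A smaller bandwidth follows the data more closely and looks bumpier; a larger one gives a smoother, flatter curve.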

Number of Bins

Optional - Specify how many bins to use.

Enter a number to control the granularity of the histogram. More bins show more detail but may introduce noise; fewer bins show broader patterns.

If not specified, the number of bins is calculated automatically using Sturges' rule or the Freedman-Diaconis rule.
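As a rough illustration of the two rules named above, the sketch below computes a bin count with each one using only the Python standard library. It shows the textbook formulas, not necessarily how this tool applies them; the function names and the sample data are assumptions.

```python
import math
import statistics

def sturges_bins(data):
    """Sturges' rule: bin count grows with log2 of the sample size."""
    return math.ceil(math.log2(len(data))) + 1

def freedman_diaconis_bins(data):
    """Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    width = 2.0 * (q3 - q1) / len(data) ** (1.0 / 3.0)
    if width == 0:
        return 1  # degenerate case: no spread in the middle half of the data
    return max(1, math.ceil((max(data) - min(data)) / width))

# Hypothetical data: 64 evenly spaced values from 0 to 63.
values = list(range(64))
k_sturges = sturges_bins(values)
k_fd = freedman_diaconis_bins(values)
```

Sturges' rule depends only on the number of observations, while Freedman-Diaconis also uses the spread of the data, so the two can suggest quite different bin counts for the same column.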

Bin Size

Optional - Specify the width of each bin.

Alternative to "Number of Bins". Sets a fixed width for bins (e.g., bins of width 5 for ages: 0-5, 5-10, 10-15, etc.).
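To make the fixed-width idea concrete, here is a small sketch that builds bin edges from a chosen width. The helper name `fixed_width_edges`, its `start` parameter, and the ages are hypothetical; the tool may place its edges differently.

```python
import math

def fixed_width_edges(data, width, start=None):
    """Return equally spaced bin edges; bins are half-open: [edge, next_edge)."""
    lo = min(data) if start is None else start
    # Number of bins needed so the last edge lies past the maximum value.
    n_bins = max(1, math.floor((max(data) - lo) / width) + 1)
    return [lo + i * width for i in range(n_bins + 1)]

ages = [3, 7, 12, 14, 21, 29]
edges = fixed_width_edges(ages, width=5, start=0)
```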

Cumulative

Optional - Show cumulative distribution.

Instead of showing frequency in each bin, shows cumulative frequency up to that bin.
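The cumulative view is a running total of the per-bin counts, which the following sketch illustrates with made-up counts:

```python
from itertools import accumulate

bin_counts = [2, 5, 9, 3, 1]               # hypothetical per-bin counts
cumulative = list(accumulate(bin_counts))  # running total up to each bin
```

The last cumulative value always equals the total number of observations.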

Normalization

Optional - How to normalize the histogram.

Options:

None - Show raw counts
Probability - Normalize so bar heights sum to 1
Probability Density - Normalize so the total area of the bars equals 1
Percent - Normalize so bar heights sum to 100%
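The difference between these options is easiest to see as arithmetic on the raw counts. The sketch below assumes equal-width bins; the counts and the bin width are invented for the example.

```python
counts = [2, 5, 9, 3, 1]   # hypothetical raw counts per bin
bin_width = 5.0            # hypothetical width shared by all bins
total = sum(counts)

probability = [c / total for c in counts]            # heights sum to 1
percent = [100.0 * c / total for c in counts]        # heights sum to 100
density = [c / (total * bin_width) for c in counts]  # bar areas sum to 1
```

Because density divides by the bin width, density values depend on the units of the column, whereas probability and percent do not.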

Histogram Function

Optional - Statistical function to apply.

Options:

Count - Number of observations (default)
Sum - Sum of values
Average - Mean of values
Min - Minimum value
Max - Maximum value
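One way to read these functions is as an aggregation applied to the values that fall into each bin. The sketch below assumes a second column supplies the values being aggregated; that pairing, the helper `aggregate_by_bin`, and the data are assumptions for illustration rather than a description of the tool's internals.

```python
from collections import defaultdict
from statistics import mean

def aggregate_by_bin(xs, ys, width, func):
    """Group ys by the bin that the matching x falls into, then aggregate."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        left_edge = (x // width) * width  # left edge of the bin containing x
        groups[left_edge].append(y)
    return {edge: func(vals) for edge, vals in sorted(groups.items())}

# Hypothetical data: customer ages (binned) and order amounts (aggregated).
customer_ages = [22, 24, 31, 33, 38]
amounts = [10.0, 30.0, 5.0, 15.0, 40.0]

count_per_bin = aggregate_by_bin(customer_ages, amounts, 10, len)
sum_per_bin = aggregate_by_bin(customer_ages, amounts, 10, sum)
avg_per_bin = aggregate_by_bin(customer_ages, amounts, 10, mean)
```

Passing `min` or `max` as `func` gives the remaining two options.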

Bar Mode

Optional - How to display multiple histograms.

Options:

Overlay - Overlay histograms with transparency
Group - Place bars side-by-side
Stack - Stack bars on top of each other

Opacity

Optional - Transparency of bars (0-1).

Lower values make bars more transparent, useful when overlaying multiple distributions.

Understanding Distributions

Normal Distribution (Bell Curve)

Symmetric distribution with most values near the mean.

Characteristics:

Symmetric around the mean
Mean ≈ Median ≈ Mode
About 68% of data within 1 standard deviation of the mean
About 95% within 2 standard deviations of the mean
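The percentages above follow from the normal distribution's cumulative distribution function. A quick check with Python's standard library (the helper name `share_within` is just for this example):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1 = share_within(1)  # about 0.68
within_2 = share_within(2)  # about 0.95
```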

Right-Skewed Distribution

Long tail on the right side.

Characteristics:

Mean > Median (typically)
Common in: income data, response times, sizes
May need log transformation for analysis
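The effect of a log transformation can be seen on a deliberately idealized example. The values below double at each step, so they are strongly right-skewed on the raw scale and perfectly symmetric on the log scale; real data will rarely be this clean.

```python
import math
from statistics import mean, median

# Hypothetical file sizes in KB, doubling each time (long right tail).
sizes = [1, 2, 4, 8, 16, 32, 64, 128, 256]

raw_gap = mean(sizes) - median(sizes)          # positive: mean is pulled right

log_sizes = [math.log2(s) for s in sizes]      # 0, 1, 2, ..., 8
log_gap = mean(log_sizes) - median(log_sizes)  # zero: symmetric after the transform
```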

Left-Skewed Distribution

Long tail on the left side.

Characteristics:

Mean < Median (typically)
Less common than right-skewed
Example: test scores (when most score high)

Bimodal Distribution

Two distinct peaks.

Characteristics:

Two modes (peaks)
Suggests two different groups or processes
Consider separating and analyzing groups individually

Uniform Distribution

Approximately equal frequency across bins.

Characteristics:

Flat appearance
All values equally likely
Example: random number generators

Tips for Effective Histograms

Choose Appropriate Bins:
- Too few bins hide important features
- Too many bins create noise
- Start with auto-calculated bins, then adjust
Consider Bin Width:
- Use meaningful intervals (e.g., $10,000 for income, 5 years for age)
- Ensure bins don't hide important patterns
Handle Outliers:
- Outliers can compress the main distribution
- Consider filtering extreme values or using log scale
- Or show outliers separately
Compare Distributions:
- Use overlay mode with transparency
- Or use small multiples (facets)
- Normalize when counts differ greatly
Add Context:
- Show mean/median lines
- Add KDE for smooth overview
- Annotate important bins
Check for Artifacts:
- Gaps might indicate data collection issues
- Spikes might indicate rounding or discrete values
- Verify patterns make domain sense
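For the outlier tip above, one common way to separate extreme values before plotting is Tukey's fences (quartiles plus or minus 1.5 times the interquartile range). This is a sketch of that general technique, not a feature of this tool; the function name and readings are hypothetical.

```python
from statistics import quantiles

def split_outliers(data, k=1.5):
    """Split data into (inliers, outliers) using Tukey's fences."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    inliers = [x for x in data if low <= x <= high]
    outliers = [x for x in data if x < low or x > high]
    return inliers, outliers

readings = [10, 11, 12, 12, 13, 13, 14, 15, 95]
inliers, outliers = split_outliers(readings)
```

Plotting `inliers` keeps the main distribution readable, and `outliers` can be reported or shown separately.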

Troubleshooting

Issue: Distribution looks choppy or irregular

Solution: Decrease the number of bins (or increase the bin size), or use KDE for a smoother view

Issue: Can't see the pattern

Solution: Try log scale, adjust bin size, or filter outliers

Issue: Multiple distributions hard to compare

Solution: Use normalization (probability or percent) so heights are comparable

Issue: Bars are too thin or wide

Solution: Adjust number of bins or specify custom bin size

Issue: Peak is cut off

Solution: Check Y-axis range in advanced settings

Issue: Data is discrete but the bins are continuous

Solution: Adjust bins to align with discrete values (e.g., integer ages)
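One way to align bins with integer data is to put the edges halfway between integers, so each value sits in the middle of its own bin. The sketch below assumes integer-valued input; the helper name and the ages are made up for the example.

```python
from collections import Counter

def integer_aligned_edges(values):
    """Edges at k - 0.5 so every integer from min to max gets its own bin."""
    lo, hi = min(values), max(values)
    return [k - 0.5 for k in range(lo, hi + 2)]

ages = [18, 19, 19, 20, 20, 20, 22]
edges = integer_aligned_edges(ages)
counts = Counter(ages)  # one count per distinct integer value
```

With these edges, an age that never occurs (21 here) shows up as an empty bin instead of being merged into a neighbor.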
