Documentation (English)

Violin Plot

Visualize distribution density and statistical summary

Use me when a box plot feels too boxy - I'll show you the full elegant shape of your distribution. Like a box plot that went to art school, I reveal not just quartiles but the actual curves and bumps of your data. See multiple peaks? Asymmetric spread? Smooth or choppy distribution? I combine statistical rigor with visual beauty - perfect when you want both precision and elegance.

Overview

A violin plot combines a box plot with a kernel density estimate, showing the full shape of the data's distribution. The width of the violin at a given value indicates how densely the data is concentrated there, so it conveys more than a box plot, which shows only summary statistics such as the median and quartiles.
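To make the link between density and width concrete, here is a minimal sketch of a Gaussian kernel density estimate in plain Python. The function name, the sample data, and the fixed bandwidth of 0.3 are illustrative assumptions; the kernel and bandwidth this tool actually uses are not stated here.

```python
import math

def gaussian_kde(samples, grid, bandwidth):
    """Estimate the density at each grid value with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    densities = []
    for y in grid:
        # Each sample contributes a small bell curve centered on itself.
        total = sum(math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in samples)
        densities.append(norm * total)
    return densities

# Hypothetical data with two clusters (around 1 and around 3).
samples = [1.0, 1.2, 1.1, 0.9, 3.0, 3.1, 2.9, 3.2]
grid = [i * 0.1 for i in range(45)]          # values 0.0 .. 4.4
density = gaussian_kde(samples, grid, bandwidth=0.3)

# The violin's half-width at each grid value is the density,
# rescaled here so the widest point equals 1.
half_widths = [d / max(density) for d in density]
```

With this data the outline is wide near 1 and 3 and narrow near 2, which is the two-peaked shape a box plot would hide.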

Best used for:

  • Comparing distributions across multiple groups
  • Showing multimodal distributions (multiple peaks)
  • Visualizing data density and shape
  • Combining statistical summary with distribution detail
  • Understanding data spread and concentration
  • Detecting distribution asymmetry and skewness

Common Use Cases

Statistical Analysis

  • Distribution comparison across experimental groups
  • Response time analysis across conditions
  • Survey response distributions by demographic
  • Test score distributions by class or school
  • Measurement variation across different setups

Quality Control & Manufacturing

  • Process output distribution by machine or shift
  • Measurement precision across instruments
  • Product dimension distributions
  • Quality metrics by production batch
  • Performance consistency analysis

Business Analytics

  • Sales distribution by region or product
  • Customer lifetime value distributions
  • Response rates across campaigns
  • Pricing sensitivity across segments
  • Performance metrics by team

Understanding Violin Plot Components



     ╱───╲       ← Top of distribution
    │     │      ← Density shape (wide = more data)
    │  ─  │      ← Median line (if shown)
    │ ┌─┐ │      ← Box plot (if shown)
     ╲───╱       ← Bottom of distribution

  • Width: Indicates data density at that value
  • Symmetry: The outline is mirrored around the center axis; this is a drawing convention and does not mean the data itself is symmetric
  • Multiple peaks: Show multimodal distributions
  • Box plot overlay: Shows quartiles and median

Comparison: Violin vs Box Plot

Violin Plot Advantages:

  • Shows full distribution shape
  • Reveals multimodal distributions (multiple peaks)
  • Displays density at all levels
  • Better for understanding data concentration

Box Plot Advantages:

  • Simpler, easier to read quickly
  • Clear outlier identification
  • Standard statistical measures (quartiles)
  • Less cluttered with many categories

Best Practice: Use violin plots for detailed distribution analysis, box plots for quick statistical summaries.

Tips for Effective Violin Plots

  1. Choose Based on Sample Size:

    • Large samples (>100): Violin plots shine
    • Small samples (<30): Box plots may be more appropriate
    • Very large samples: Consider density plots or histograms
  2. Overlay Box Plots:

    • Enable "Show Box Plot" for best of both worlds
    • Provides familiar statistical markers
    • Makes quartiles immediately visible
  3. Use Split Violins for Comparison:

    • Perfect for before/after or treatment/control comparisons
    • Requires exactly 2 groups
    • Shows differences more clearly than side-by-side
  4. Scale Mode Strategy:

    • Width mode: When comparing distributions across equal groups
    • Count mode: When sample sizes differ significantly
  5. Show Individual Points:

    • For small to medium datasets (< 500 points)
    • Helps verify distribution shape
    • Reveals discrete vs continuous data
  6. Handle Outliers Wisely:

    • Show outliers to identify data quality issues
    • Consider if outliers are real or errors
    • Use log scale if outliers compress main distribution

Example Scenarios

  • Distribution Comparison Across Groups
  • Split Violin (Treatment vs Control)
  • Violin with Box Plot Overlay
  • Multiple Variables Comparison

Options

X-Axis

Optional - Select a categorical column to group violins.

When specified, creates separate violin plots for each category, allowing side-by-side comparison. Leave empty for a single violin.

Y-Axis

Required - Select one or more numerical columns to analyze.

The values in these columns will be used to create the violin plot distribution. You can select multiple columns to compare their distributions.

Aggregation Column

Optional - Apply aggregation before creating violin.

If you need to aggregate your data first, specify the column and aggregation function here.

Column

Select the column to aggregate.

Aggregation Function

Options:

  • Sum, Count, Mean, Median, Min, Max, Std, Var, First, Last
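As a rough illustration of what aggregating before plotting means, the sketch below groups rows by a key and reduces each group to one value. The row layout, the `aggregate` helper, and the choice of sample (rather than population) formulas for Std and Var are assumptions made for this example, not a description of the tool's internals.

```python
import statistics
from collections import defaultdict

# Assumed mapping from the option names to Python callables.
AGGREGATIONS = {
    "Sum": sum,
    "Count": len,
    "Mean": statistics.mean,
    "Median": statistics.median,
    "Min": min,
    "Max": max,
    "Std": statistics.stdev,      # sample standard deviation, needs 2+ values
    "Var": statistics.variance,   # sample variance, needs 2+ values
    "First": lambda values: values[0],
    "Last": lambda values: values[-1],
}

def aggregate(rows, key, column, function):
    """Group rows by `key` and reduce `column` with the named function."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    return {k: AGGREGATIONS[function](v) for k, v in groups.items()}

# Hypothetical rows: several line items per order.
rows = [
    {"order_id": "A", "amount": 10.0},
    {"order_id": "A", "amount": 5.0},
    {"order_id": "B", "amount": 7.0},
    {"order_id": "B", "amount": 3.0},
    {"order_id": "B", "amount": 2.0},
]
order_totals = aggregate(rows, "order_id", "amount", "Sum")
```

The violin would then be drawn from the aggregated values (one per order in this example) instead of the raw rows.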

Settings

Hide Empty Values

Optional - Exclude categories with no data.

Use Logarithmic Scale For X Axis

Optional - Apply log scale to X-axis.

Use Logarithmic Scale For Y Axis

Optional - Apply log scale to Y-axis.

Useful when data values span orders of magnitude.
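A small sketch of why a logarithmic axis helps: values that differ by factors of ten end up evenly spaced. The helper name and the data are hypothetical, and the positivity check reflects the general fact that logarithms are undefined for zero and negative values.

```python
import math

def log_axis_positions(values):
    """Return the base-10 log of each value, i.e. its position on a log axis."""
    if any(v <= 0 for v in values):
        raise ValueError("a logarithmic axis needs strictly positive values")
    return [math.log10(v) for v in values]

# Hypothetical measurements spanning more than four orders of magnitude.
values = [3, 40, 500, 6000, 70000]
positions = log_axis_positions(values)

# On a linear axis the first four values sit within the lowest 10% of the range;
# on a log axis neighbouring values are each roughly one unit apart.
gaps = [b - a for a, b in zip(positions, positions[1:])]
```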

Show Box Plot

Optional - Overlay a box plot inside the violin.

Shows the quartiles, median, and whiskers inside the violin for additional statistical context.

Display Meanline

Optional - Show a line indicating the mean value.

All Points

Optional - Show all individual data points.

Overlays all data points on the violin, useful for seeing actual data distribution and sample size.

Outliers

Optional - Show outlier points.

Linear

Optional - Use linear method for quartile calculation.

Exclusive

Optional - Use exclusive method for quartile calculation.

Inclusive

Optional - Use inclusive method for quartile calculation.
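The quartile methods are not defined in detail here. Under one common convention, which this sketch assumes, the exclusive and inclusive methods both split the sorted data at the median and take the median of each half; they differ only in whether the middle value of an odd-sized sample belongs to both halves. The linear method, which interpolates between neighbouring values, is left out.

```python
import statistics

def quartiles(values, method="exclusive"):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    Needs at least two values so that each half is non-empty.
    """
    if method not in ("exclusive", "inclusive"):
        raise ValueError("method must be 'exclusive' or 'inclusive'")
    data = sorted(values)
    n = len(data)
    half = n // 2
    if n % 2 == 1 and method == "inclusive":
        # Odd sample size: the middle value is counted in both halves.
        lower, upper = data[:half + 1], data[half:]
    else:
        # Even sample size, or exclusive: the middle value is in neither half.
        lower, upper = data[:half], data[n - half:]
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
exclusive = quartiles(data, "exclusive")   # (2.5, 5, 7.5)
inclusive = quartiles(data, "inclusive")   # (3, 5, 7)
```

For even sample sizes the two methods agree under this convention; the difference only shows up when the count is odd, and it shrinks as the sample grows.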

Violin Mode

Optional - How to display multiple violins.

Options:

  • Group (Side-by-side) - Default, best for comparison
  • Overlay (On top) - Useful with transparency

Split Violin

Optional - Create split violin for two-group comparison.

Requires exactly 2 categories. Shows both distributions on the same violin, split down the middle for easy comparison.

Opacity

Optional - Transparency of violin fill.

Options:

  • 100%, 80%, 65%, 50%

Scale Mode

Optional - How to scale violin width.

Options:

  • Width (Same width) - All violins same width
  • Count (Scale by count) - Width reflects sample size
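The difference between the two modes can be sketched as a choice of scaling factor. This example assumes that Count mode scales each violin in proportion to its sample count relative to the largest group; the exact formula used by the tool is not documented here, and the function and data are hypothetical.

```python
def scale_violins(densities, counts, mode="width"):
    """Rescale per-group density curves to relative violin widths (0..1)."""
    if mode not in ("width", "count"):
        raise ValueError("mode must be 'width' or 'count'")
    max_count = max(counts.values())
    scaled = {}
    for group, curve in densities.items():
        peak = max(curve)
        # Width mode: every violin reaches the same maximum width.
        # Count mode (assumed): maximum width shrinks with the group's share of points.
        factor = 1.0 if mode == "width" else counts[group] / max_count
        scaled[group] = [factor * d / peak for d in curve]
    return scaled

# Hypothetical density curves and sample sizes for two groups.
densities = {"A": [0.1, 0.4, 0.2], "B": [0.2, 0.8, 0.2]}
counts = {"A": 50, "B": 200}

same_width = scale_violins(densities, counts, mode="width")
by_count = scale_violins(densities, counts, mode="count")
```

In Width mode both groups peak at 1.0; in Count mode group A peaks at 0.25 because it has a quarter of group B's points, which makes the imbalance in sample size visible.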

Jitter

Optional - Random horizontal offset for overlapping points.

Options:

  • None (0.0), Low (0.1), Medium (0.3), High (0.5)
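Jitter can be pictured as a random horizontal offset drawn for each point. The sketch below assumes the jitter value is a fraction of the violin's width and that offsets are uniform; both are illustrative assumptions, as are the function and parameter names.

```python
import random

def jittered_positions(n_points, center=0.0, jitter=0.3, violin_width=1.0, seed=0):
    """Return one horizontal position per point, spread randomly around `center`."""
    rng = random.Random(seed)                  # seeded so the layout is reproducible
    half_range = jitter * violin_width / 2.0   # assumed: jitter is a fraction of the width
    return [center + rng.uniform(-half_range, half_range) for _ in range(n_points)]

no_jitter = jittered_positions(5, jitter=0.0)     # all points stacked on the center
medium = jittered_positions(200, jitter=0.3)      # spread within +/- 0.15
high = jittered_positions(200, jitter=0.5)        # spread within +/- 0.25
```

Only the horizontal position changes; the plotted values on the Y-axis stay exactly as they are in the data.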

Point Position

Optional - Horizontal position of data points relative to violin.

Options:

  • Far Left (-1.0), Left (-0.5), Center (0.0), Right (0.5), Far Right (1.0)

Violin Gap

Optional - Space between violins.

Options:

  • None (0.0), Small (0.1), Medium (0.2), Large (0.3)

Show Legend

Optional - Display legend for multiple series.

Troubleshooting

Issue: Violin looks strange or has gaps

  • Solution: May indicate discrete data or insufficient data points. Check data type and sample size. Consider histogram instead for small samples.

Issue: Can't see the density shape clearly

  • Solution: Increase sample size (violins need 50+ points for smooth shape), adjust bandwidth parameter if available, or use box plot for small samples.

Issue: Split violin not working

  • Solution: Ensure X-axis column has exactly 2 unique categories. More or fewer categories won't work with split violin mode.
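A quick way to check this requirement before plotting is to count the distinct categories in the grouping column. The helper below is a hypothetical sketch; ignoring missing values (`None`) is an assumption, since how the tool treats empty categories depends on its settings.

```python
def split_violin_categories(column_values):
    """Return the distinct non-missing categories, in first-seen order."""
    seen = []
    for value in column_values:
        if value is not None and value not in seen:
            seen.append(value)
    return seen

def can_split(column_values):
    """True only when the column has exactly two distinct categories."""
    return len(split_violin_categories(column_values)) == 2

ok = can_split(["treatment", "control", "control", "treatment"])
too_many = can_split(["low", "medium", "high"])
too_few = can_split(["control", "control"])
```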

Issue: Violins overlap and are hard to read

  • Solution: Increase "Violin Gap" setting, switch to overlay mode with reduced opacity, or consider faceting for many categories.

Issue: Width doesn't reflect density correctly

  • Solution: Check "Scale Mode" setting. "Width" mode makes all violins same maximum width, while "Count" mode scales by sample size.

Issue: Points are all on top of each other

  • Solution: Increase jitter value to add random horizontal spread (try 0.3-0.5), or adjust point position if needed.

Issue: Can't compare violins to each other

  • Solution: Enable box plot overlay for quartile reference, ensure same scale mode across all violins, sort by median for easier comparison.

Issue: Distribution looks multimodal but shouldn't be

  • Solution: Check for mixed populations in data, verify data quality, or investigate if multimodality is real signal vs noise.

Issue: Need statistical test results

  • Solution: Violin plots are exploratory. Use notched box plots for confidence intervals, or perform separate statistical tests (t-test, ANOVA).
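As an example of a separate test, the sketch below computes Welch's t statistic for two groups using only the standard library. It is one of several t-test variants and is chosen here for illustration; turning the statistic into a p-value additionally requires the t distribution, which this sketch does not cover. The function name and data are hypothetical.

```python
import math
import statistics

def welch_t_statistic(group_a, group_b):
    """Welch's t statistic: difference in means over its estimated standard error."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sample variances; each group needs at least two values.
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error

# Hypothetical measurements for two groups shown as two violins.
control = [1, 2, 3, 4, 5]
treatment = [2, 3, 4, 5, 6]
t_value = welch_t_statistic(control, treatment)
```

A statistic close to zero means the difference in means is small relative to the spread within the groups, which usually matches violins that overlap heavily.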
