Breaking changes: The following changes, mostly due to new default density estimators, may cause some plots of sample data to change. Changes should usually be small and should generally result in more accurate density estimation. To revert to the old behavior, set density = density_unbounded(bandwidth = "nrd0").
density_bounded() as its default density estimator, which uses a bounded density estimator that also estimates the bounds of the data. The default bandwidth estimator is also now bandwidth_dpi(), which is the Sheather-Jones direct plug-in estimator (the same as stats::bw.SJ(..., method = "dpi")). These changes may cause existing charts using densities to change, usually only slightly. The change should be worthwhile: it should drastically improve the accuracy of density estimates, especially on bounded data, and should have little noticeable impact on densities on unbounded data.
density_bounded() now estimates bounds from the data when not provided (i.e. when one of
NA). See the
bounder_cooke()) for more on bounds estimation.
hdi() estimators based on the bounded density estimator.
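
The entry above equates bandwidth_dpi() with stats::bw.SJ(..., method = "dpi"). As a rough, base-R-only illustration of what switching from the "nrd0" rule of thumb to the Sheather-Jones direct plug-in bandwidth means, the sketch below computes both on the same sample. The sample and variable names are illustrative assumptions, and this does not reproduce ggdist's bounded estimator.

```r
# Compare the rule-of-thumb bandwidth with the Sheather-Jones
# direct plug-in bandwidth using only base R (stats).
set.seed(1234)
x <- rexp(500)  # illustrative sample; the true density is bounded below at 0

bw_nrd0 <- stats::bw.nrd0(x)               # the old default rule of thumb
bw_dpi  <- stats::bw.SJ(x, method = "dpi") # Sheather-Jones direct plug-in

# Both fits below are ordinary (unbounded) kernel density estimates;
# only the bandwidth differs.
d_nrd0 <- stats::density(x, bw = bw_nrd0)
d_dpi  <- stats::density(x, bw = bw_dpi)

print(c(nrd0 = bw_nrd0, dpi = bw_dpi))
```

Plotting d_nrd0 and d_dpi on the same axes is one way to see how much a given chart might shift after upgrading.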
New features and enhancements:
hdci() estimator using quantile estimation.
density_histogram(), a histogram density estimator. Finer-grained control of bin positions is now possible using the breaks argument (including the new breaks_fixed() for manually-specified bin widths) and the align argument (including the new align_center() for choosing how to align bin positions to reference points). (#118)
stat_spike() for adding spike annotations to slabs created with stat_slabinterval(). See example in vignette("slabinterval"). (#58, #124)
parse_dist() now outputs distributional objects in a .dist_obj column in addition to the name-plus-arguments (
.args) format, and these objects respect truncation parameters from prior specifications. This makes it easier to visualize standard deviation priors, for example, giving a better solution to #20.
.dist_obj column output by
parse_dist() in addition to the
geom_lineribbon() now obeys the order aesthetic, allowing you to arbitrarily set the draw order of ribbons (#171). Enabled by this change,
order = after_stat(level) by default, making its draw order more correct by ensuring all ribbons of the same level are drawn together.
adapt parameter; note that it is unsupported and both the implementation and interface are highly likely to change.
stat_slabinterval() is now deprecated in favor of mapping the corresponding computed variable (
cdf) onto the desired aesthetic. For slab_type = "histogram", use the density_histogram() density estimator (e.g. set density = "histogram"). (#165)
geom_lineribbon() draw order is now correct even when some portions of a ribbon have
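
For the slab_type deprecation noted in this list, a migration might look roughly like the sketch below. It assumes ggplot2 and a version of ggdist that includes these changes are installed; the data frame and the choice of the thickness aesthetic as the target of the cdf mapping are illustrative assumptions rather than the only option.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(x = rnorm(200))  # illustrative sample data

# Rather than a slab_type for the CDF, map the cdf computed variable
# onto the desired aesthetic (thickness is assumed here).
p_cdf <- ggplot(df, aes(x = x)) +
  stat_slab(aes(thickness = after_stat(cdf)))

# Rather than slab_type = "histogram", select the histogram density estimator.
p_hist <- ggplot(df, aes(x = x)) +
  stat_slab(density = "histogram")
```

Printing p_cdf or p_hist renders the plots.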
New features and enhancements:
dist_bernoulli(), and the upcoming
layout = "hex" allows a hexagonal circle-packing style layout (#161).
smooth = "bounded" / smooth = "unbounded" (for “density dotplots”) and smooth = "discrete" / smooth = "bar" (for improved layout of large-n discrete distributions). (#161)
overlaps = "keep" option disables bin/dot nudging in
"weave" layouts. This means layout = "weave" with overlaps = "keep" will yield exact dot positions. (#161)
"weave" layout now works properly with side = "both"
binwidth of 1 for discrete distributions (#159)
overflow = "compress" allows layouts to be compressed to fit into the geom bounds if a user-specified binwidth would otherwise cause the dots to exceed the geom bounds. (#162)
geom_weave(). Both can be used to quickly create “beeswarm”-like plots.
scale_side_mirrored(), makes it easier to create mirrored slabs and dotplots. (#142)
density argument, including a new bounded density estimator (
linewidth aesthetics in ggplot2 3.4, the following aesthetics have been updated (#138):
p_() functions can be used to generate after_stat() expressions in terms of ggdist computed variables; e.g. aes(thickness = !!Pr_(X <= x)) maps the CDF of the distribution onto the
aes(thickness = !!p_(x)) maps the PDF onto the
point_interval() on grouped data frames. (#154)
stat() have been replaced with after_stat() to be consistent with the deprecation of stat() in ggplot2 3.4.
New features and enhancements:
stat_slabinterval() can now be shared across sub-geometries:
level computed variables can now be used in slab / dots sub-geometries. These values correspond to the smallest interval computed in the interval sub-geometry containing that portion of the slab. This gives a more flexible alternative to using cut_cdf_qi() to create densities filled according to a set of intervals (this approach also works on highest-density intervals, which cut_cdf_qi() does not). Examples in vignette("slabinterval") have been updated to use the new approach, and an example has been added to vignette("dotsinterval") showing how to color dots by intervals.
options(ggdist.experimental.slab_data_in_intervals = TRUE), the
cdf computed variables can now be used in interval sub-geometries to get the PDF and CDF at the point summary.
cdf_max also give the PDF and CDF at the lower and upper ends of the interval. An example in vignette("lineribbon") shows how to use this to make lineribbon gradients whose color approximates density (as opposed to the classic gradient fan chart examples already in that vignette, where color approximates the CDF).
scale_thickness_shared() is now provided to allow the thickness scale to be shared across geometries, making certain plot types easier to create (e.g. plots of prior and posterior densities together). See vignette("slabinterval") for an example.
thickness is less than 0 it is normalized to have a minimum of zero when normalization is turned on; this makes it easier to use slab functions that go below zero. A new example in vignette("slabinterval") shows how to use this to create raindrop plots.
geom_dotsinterval(layout = "bin") can now be set using the order aesthetic. This makes it possible to create “stacked” dotplots by mapping a discrete variable onto the order aesthetic (#132). As part of this change, bin_dots() now maintains the original data order within bins when layout = "bin". See an example in
verbose = TRUE flag in geom_dotsinterval() outputs the selected binwidth in both data units and normalized parent coordinates. This may be useful if you want to start with an automatically-selected bin width and then adjust it manually. Note, though, that if you just want to scale the selected bin width to fit within a desired area, it is probably better to use scale, and if you want to provide constraints on the bin width, you can pass a 2-vector to
stat_slabinterval() can now take a length-two logical vector to control expansion to the lower and upper limits respectively (#129). Thanks to @teunbrand.
geom_dotsinterval() now supports the family aesthetic for setting the font used to display its dots (based on a conversation with @gdbassett).
guide_rampbar() for creating gradient-like legends for continuous color/fill ramp scales, based on ggplot2::guide_colorbar(). See an example in
NAs in the thickness aesthetic of a slab, these are now rendered as gaps in the slab (#129).
stat_slabinterval(), a function with that name will be searched for in the calling environment and the ggdist package environment. The latter ensures that
stats work when ggdist is loaded but not attached to the search path (#128).
New features and enhancements:
stat_dist_... families of stats have been merged (#83).
stat_dist_... stats are deprecated in favor of their stat_... counterparts, which now understand the
ydist can now be used in place of the dist aesthetic to specify the axis one is mapping a distribution onto (dist may be deprecated in the future).
y aesthetics now raise a helpful error message suggesting you probably want to use
stat_slabinterval() allows explicitly setting whether or not the slab is expanded to the limits of the scale (rather than implicitly setting this based on
point_interval() family of functions can now be passed
posterior::rvar() objects, meaning that means and modes (in addition to medians) and highest-density intervals (in addition to quantile intervals) can now be visualized for analytical distributions.
rvars will generate a .index column when passed to point_interval() functions (#111). Based on a suggestion from @mitchelloharawild.
stat_ribbon() provided as a shortcut stat for stat_lineribbon() with no line (#48). Also, if you supply only an
geom_lineribbon(), you will get ribbons without a line (#127).
ul() (upper limit) or ll() (lower limit), e.g. with point_interval() explicitly or via
1.07 for dot sizes is now exposed as the default value of the dotsize parameter instead of being applied internally. This fudge factor tends (in my opinion) to make dotplots look a bit better due to the visual distance between circles, but is (I think) better used as an explicit value than an implicit one, hence the change. This may create subtle changes to plots that use the
stackratio parameters, but allows those parameters to have a more precise geometric interpretation.
geom_slabinterval() family: each shortcut stat/geom now has its own documentation page that comprehensively lists all parameters, aesthetics, and computed variables, including those pulled in via ... from typically-paired geoms. These docs are auto-generated and should be easy to maintain going forward. (#36)
geom_lineribbon() family now also has separate documentation pages with a comprehensive listing of aesthetics and parameters (#107).
vignette("slabinterval") using the new
Deprecations and removals:
.prob argument, a long-deprecated alias for .width, was removed.
stat_slabinterval() were removed: these were largely internal-use parameters only needed by subclasses of the base class for creating shortcut stats, yet added a lot of noise to the documentation, so these were replaced with the
$compute_intervals() methods on the new AbstractStatSlabinterval internal base class.
NAs for analytical distributions.
"weave" layouts could be incorrect with aesthetics mapped at a sub-bin level.
stackratios that are not equal to 1 are now accounted for in find_dotplot_binwidth() (i.e. automatic dotplot bin width selection).
fill_ramp aesthetic ramps them to the same color.
distributional >= 0.2.2.9000 (#91).
linearGradient()function on R < 4.1.
geom_slabinterval() family geoms when using position_dodge() is now slightly different in order to match up with how other geoms are positioned (#85). This may slightly change existing charts that use position = "dodge", and in some cases may cause slabs to be drawn slightly outside plot boundaries, but makes it much easier to combine geom_slabinterval() with other geoms in the expected way. If dodging more similar to the old approach is needed, use the new “justification-preserving dodge”, position_dodgejust(), in place of
scale can now be used as aesthetics instead of parameters, allowing them to vary across slabs within the same geom.
fills within a slab in geom_slabinterval() can now be drawn as true gradients rather than segmented polygons in R >= 4.1 by setting fill_type = "gradient". This substantially improves the appearance of gradient fills in graphics engines that support it (#44).
stat_dist_slabinterval() and company now detect discrete distributions and display them as histograms (#19).
geom_dotsinterval()now adjusts bin widths on discrete distributions when they would result in bins that are taller than the allocated space to ensure that they fit within the required space (#42).
geom_dotsinterval() bin width by passing a vector of two values to the
geom_dotsinterval() has been factored out and exported as bin_dots() for others to use (#77).
curve_interval() used a common (but naive) approach to finding a cutoff on data depth to identify the X% “deepest” curves, simply taking the envelope around the X% quantile of curves ranked by depth. This is quite conservative and tends to create intervals that are too wide; curve_interval() now searches for a cutoff in data depth such that X% of curves are contained within its envelope (#67).
point_interval() and company now accept posterior::rvar()s (full support for
Substantial improvements to the documentation of aesthetics and computed variables in
stat_slabinterval(), and company, listing all custom aesthetics, computed variables, and their usage.
Several new examples in vignette("slabinterval"), including “rain cloud” plots and an example of histograms for discrete analytical distributions.
stat_dist_slabinterval() preserves group order (#88).
NA handling across the geoms (#74, #51).
"swarm" layouts for dots geoms (#64). These provide alternative layouts that keep datapoints in their actual positions on the data axis. The "weave" layout maintains rows but not columns and works well for quantile dotplots; the "swarm" layout uses the
beeswarm::beeswarm() (courtesy James Trimble) and works well on sample data. See the dotplot section of
unit() to specify bin widths manually for dots geoms and stats, which can be helpful when you need dotplots across facets to have the same bin width (#53).
cdf computed variables for the stat_sample_slabinterval() subfamily. See new examples of usage in the last section of
cut_cdf_qi() for creating (amongst other things) interval-filled halfeyes, in the style of
geom_lineribbon() families, making it easier to separate group colors from interval/density/CDF colors. See new examples in
interval_size_range argument in docs (#35)
New features and documentation:
curve_interval() for generating curvewise (joint) intervals for curve boxplots (#22)
stat_dist_... geoms now calculate
cdf columns to allow mashup geoms that involve both functions, such as Correll-style gradient plots combined with violins, as in Helske et al. (#11).
stat_dist_... geoms should now work with
se_fit = FALSE.
vignette("freq-uncertainty-vis"). Tidybayes will retain all other functions, and will re-export all ggdist functions for now.
h-suffix geoms are now deprecated. Those geoms have been left in tidybayes and give a deprecation warning when used; they cannot be used from
geom_lineribbon() no longer automatically set the
.upper are present in the data. This allows them to work better with automatic orientation detection (and was a bad feature to have existed in the first place anyway). The deprecated tidybayes::geom_pointintervalh() still automatically set those aesthetics, since they are deprecated anyway (so supporting the old behavior is fine in these functions).
stat_lineribbon() now supports a step argument for creating stepped lineribbons. H/T to Solomon Kurz for the suggestion.
ggdist now has its own implementation of the scaled and shifted Student’s t distribution (qstudent_t(), etc), since it is very useful for visualizing confidence distributions.
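
To make the idea of a scaled and shifted Student’s t distribution concrete, here is a minimal base-R sketch. The helper names (q_scaled_t, p_scaled_t, d_scaled_t), their argument names, and the numeric values are hypothetical and chosen for illustration; this is not ggdist's implementation, only the standard location-scale construction on top of stats::qt(), stats::pt(), and stats::dt().

```r
# Hypothetical helpers: a Student's t distribution with `df` degrees of
# freedom, shifted by `mu` and scaled by `sigma`.
q_scaled_t <- function(p, df, mu = 0, sigma = 1) {
  mu + sigma * stats::qt(p, df)
}
p_scaled_t <- function(q, df, mu = 0, sigma = 1) {
  stats::pt((q - mu) / sigma, df)
}
d_scaled_t <- function(x, df, mu = 0, sigma = 1) {
  # Dividing by sigma keeps the density integrating to 1 after rescaling
  stats::dt((x - mu) / sigma, df) / sigma
}

# Illustrative values: a point estimate, its standard error, and residual
# degrees of freedom, as one might take from a fitted model.
estimate <- 2.5
se <- 0.4
nu <- 12

# Central 95% interval of the corresponding confidence distribution
ci <- q_scaled_t(c(0.025, 0.975), df = nu, mu = estimate, sigma = se)
print(ci)
```

The interval in ci is the usual t-based 95% confidence interval, estimate ± qt(0.975, nu) * se, which is why this family is convenient for drawing confidence distributions.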