Shortcut version of stat_slabinterval()
with geom_slabinterval()
for
creating histogram + interval plots.
Roughly equivalent to:
stat_slabinterval(
density = "histogram"
)
Usage
stat_histinterval(
mapping = NULL,
data = NULL,
geom = "slabinterval",
position = "identity",
...,
density = "histogram",
p_limits = c(NA, NA),
adjust = waiver(),
trim = TRUE,
expand = FALSE,
breaks = waiver(),
align = "none",
outline_bars = FALSE,
point_interval = "median_qi",
slab_type = NULL,
limits = NULL,
n = 501,
.width = c(0.66, 0.95),
orientation = NA,
na.rm = FALSE,
show.legend = c(size = FALSE),
inherit.aes = TRUE
)
Arguments
- mapping
Set of aesthetic mappings created by
aes()
. If specified andinherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot. You must supplymapping
if there is no plot mapping.- data
The data to be displayed in this layer. There are three options:
If
NULL
, the default, the data is inherited from the plot data as specified in the call toggplot()
.A
data.frame
, or other object, will override the plot data. All objects will be fortified to produce a data frame. Seefortify()
for which variables will be created.A
function
will be called with a single argument, the plot data. The return value must be adata.frame
, and will be used as the layer data. Afunction
can be created from aformula
(e.g.~ head(.x, 10)
).- geom
Use to override the default connection between
stat_histinterval()
andgeom_slabinterval()
- position
Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to
"dodge"
(position_dodge()
) or"dodgejust"
(position_dodgejust()
) can be useful if you have overlapping geometries.- ...
Other arguments passed to
layer()
. These are often aesthetics, used to set an aesthetic to a fixed value, likecolour = "red"
orlinewidth = 3
(see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom,geom_slabinterval()
, these include:normalize
How to normalize heights of functions input to the
thickness
aesthetic. One of:"all"
: normalize so that the maximum height across all data is1
."panels"
: normalize within panels so that the maximum height in each panel is1
."xy"
: normalize within the x/y axis opposite theorientation
of this geom so that the maximum height at each value of the opposite axis is1
."groups"
: normalize within values of the opposite axis and within each group so that the maximum height in each group is1
."none"
: values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).
For a comprehensive discussion and examples of slab scaling and normalization, see the
thickness
scale article.fill_type
What type of fill to use when the fill color or alpha varies within a slab. One of:
"segments"
: breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like instat_gradientinterval()
)."gradient"
: agrid::linearGradient()
is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, thepng()
graphics device withtype = "cairo"
, thesvg()
device, thepdf()
device, and theragg::agg_png()
devices are known to support this option. On R < 4.1, this option will fall back tofill_type = "segments"
with a message."auto"
: attempts to usefill_type = "gradient"
if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back tofill_type = "segments"
(in case of a false negative,fill_type = "gradient"
can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back tofill_type = "segments"
, in which case you can setfill_type = "gradient"
explicitly if you are using a graphics device that support gradients.
interval_size_domain
A length-2 numeric vector giving the minimum and maximum of the values of the
size
andlinewidth
aesthetics that will be translated into actual sizes for intervals drawn according tointerval_size_range
(see the documentation for that argument.)interval_size_range
A length-2 numeric vector. This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of
scale_size_continuous()
, which give sizes with a range ofc(1, 6)
. Theinterval_size_domain
value indicates the input domain of raw size values (typically this should be equal to the value of therange
argument of thescale_size_continuous()
function), andinterval_size_range
indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use thelinewidth
orpoint_size
aesthetics; see sub-geometry-scales.fatten_point
A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the
point_size
aesthetic andscale_point_size_continuous()
orscale_point_size_discrete()
; sizes specified with that aesthetic will not be adjusted usingfatten_point
.arrow
grid::arrow()
giving the arrow heads to use on the interval, orNULL
for no arrows.subguide
Sub-guide used to annotate the
thickness
scale. One of:A function that takes a
scale
argument giving a ggplot2::Scale object and anorientation
argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such assubguide_axis()
(to draw a traditional axis) orsubguide_none()
(to draw no annotation). Seesubguide_axis()
for a list of possibilities and examples.A string giving the name of such a function when prefixed with
"subguide"
; e.g."axis"
or"none"
.
- density
Density estimator for sample data. One of:
A function which takes a numeric vector and returns a list with elements
x
(giving grid points for the density estimator) andy
(the corresponding densities). ggdist provides a family of functions following this format, includingdensity_unbounded()
anddensity_bounded()
. This format is also compatible withstats::density()
.A string giving the suffix of a function name that starts with
"density_"
; e.g."bounded"
for[density_bounded()]
,"unbounded"
for[density_unbounded()]
, or"histogram"
fordensity_histogram()
. Defaults to"bounded"
, i.e.density_bounded()
, which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.
- p_limits
Probability limits (as a vector of size 2) used to determine the lower and upper limits of theoretical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample). E.g., if this is
c(.001, .999)
, then a slab is drawn for the distribution from the quantile atp = .001
to the quantile atp = .999
. If the lower (respectively upper) limit isNA
, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and0.001
(0.999
) if it is not finite. E.g., ifp_limits
isc(NA, NA)
, on a gamma distribution the effective value ofp_limits
would bec(0, .999)
since the gamma distribution is defined on(0, Inf)
; whereas on a normal distribution it would be equivalent toc(.001, .999)
since the normal distribution is defined on(-Inf, Inf)
.- adjust
Passed to
density
: the bandwidth for the density estimator for sample data is adjusted by multiplying it by this value. See e.g.density_bounded()
for more information. Default (waiver()
) defers to the default of the density estimator, which is usually1
.- trim
For sample data, should the density estimate be trimmed to the range of the data? Passed on to the density estimator; see the
density
parameter. DefaultTRUE
.- expand
For sample data, should the slab be expanded to the limits of the scale? Default
FALSE
. Can be length two to control expansion to the lower and upper limit respectively.- breaks
Determines the breakpoints defining bins. Defaults to
"Scott"
. Similar to (but not exactly the same as) thebreaks
argument tographics::hist()
. One of:A scalar (length-1) numeric giving the number of bins
A vector numeric giving the breakpoints between histogram bins
A function taking
x
andweights
and returning either the number of bins or a vector of breakpointsA string giving the suffix of a function that starts with
"breaks_"
. ggdist provides weighted implementations of the"Sturges"
,"Scott"
, and"FD"
break-finding algorithms fromgraphics::hist()
, as well asbreaks_fixed()
for manually setting the bin width. See breaks.
For example,
breaks = "Sturges"
will use thebreaks_Sturges()
algorithm,breaks = 9
will create 9 bins, andbreaks = breaks_fixed(width = 1)
will set the bin width to1
.- align
Determines how to align the breakpoints defining bins. Default (
"none"
) performs no alignment. One of:A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between
0
and the bin width.A function taking a sorted vector of
breaks
(bin edges) and returning an offset to subtract from the breaks.A string giving the suffix of a function that starts with
"align_"
used to determine the alignment, such asalign_none()
,align_boundary()
, oralign_center()
.
For example,
align = "none"
will provide no alignment,align = align_center(at = 0)
will center a bin on0
, andalign = align_boundary(at = 0)
will align a bin edge on0
.- outline_bars
For sample data (if
density
is"histogram"
) and for discrete analytical distributions (whose slabs are drawn as histograms), determines if outlines in between the bars are drawn when theslab_color
aesthetic is used. IfFALSE
(the default), the outline is drawn only along the tops of the bars; ifTRUE
, outlines in between bars are also drawn. Seedensity_histogram()
.- point_interval
A function from the
point_interval()
family (e.g.,median_qi
,mean_qi
,mode_hdi
, etc), or a string giving the name of a function from that family (e.g.,"median_qi"
,"mean_qi"
,"mode_hdi"
, etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval,qi
; highest-density interval,hdi
; or highest-density continuous interval,hdci
). Output will be converted to the appropriatex
- ory
-based aesthetics depending on the value oforientation
. See thepoint_interval()
family of functions for more information.- slab_type
(deprecated) The type of slab function to calculate: probability density (or mass) function (
"pdf"
), cumulative distribution function ("cdf"
), or complementary CDF ("ccdf"
). Instead of usingslab_type
to changef
and then mappingf
onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g.pdf
,cdf
, or1 - cdf
) directly onto the desired aesthetic.- limits
Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on
p_limits
as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). UseNA
to leave a limit alone; e.g.limits = c(0, NA)
will ensure that the lower limit does not go below 0, but let the upper limit be determined by eitherp_limits
or the scale settings.- n
Number of points at which to evaluate the function that defines the slab.
- .width
The
.width
argument passed topoint_interval
: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding.width
andlevel
generated variables).- orientation
Whether this geom is drawn horizontally or vertically. One of:
NA
(default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time."horizontal"
(or"y"
): draw horizontally, using they
aesthetic to identify different groups. For each group, uses thex
,xmin
,xmax
, andthickness
aesthetics to draw points, intervals, and slabs."vertical"
(or"x"
): draw vertically, using thex
aesthetic to identify different groups. For each group, uses they
,ymin
,ymax
, andthickness
aesthetics to draw points, intervals, and slabs.
For compatibility with the base ggplot naming scheme for
orientation
,"x"
can be used as an alias for"vertical"
and"y"
as an alias for"horizontal"
(ggdist had anorientation
parameter before base ggplot did, hence the discrepancy).- na.rm
If
FALSE
, the default, missing values are removed with a warning. IfTRUE
, missing values are silently removed.- show.legend
Should this layer be included in the legends? Default is
c(size = FALSE)
, unlike most geoms, to match its common use cases.FALSE
hides all legends,TRUE
shows all legends, andNA
shows only those that are mapped (the default for most geoms).- inherit.aes
If
FALSE
, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g.borders()
.
Value
A ggplot2::Stat representing a histogram + interval geometry which can
be added to a ggplot()
object.
Details
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x
or y
aesthetic.
To visualize analytical distributions, you can use the xdist
or ydist
aesthetic. For historical reasons, you can also use dist
to specify the distribution, though
this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
xdist
,ydist
, anddist
can be any distribution object from the distributional package (dist_normal()
,dist_beta()
, etc) or can be aposterior::rvar()
object. Since these functions are vectorized, other columns can be passed directly to them in anaes()
specification; e.g.aes(dist = dist_normal(mu, sigma))
will work ifmu
andsigma
are columns in the input data frame.dist
can be a character vector giving the distribution name. Then thearg1
, ...arg9
aesthetics (orargs
as a list column) specify distribution arguments. Distribution names should correspond to R functions that have"p"
,"q"
, and"d"
functions; e.g."norm"
is a valid distribution name because R defines thepnorm()
,qnorm()
, anddnorm()
functions for Normal distributions.See the
parse_dist()
function for a useful way to generatedist
andargs
values from human-readable distribution specs (like"normal(0,1)"
). Such specs are also produced by other packages (like thebrms::get_prior
function in brms); thus,parse_dist()
combined with the stats described here can help you visualize the output of those functions.
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()
) using the after_stat()
function or the after_stat
argument of stage()
:
x
ory
: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it isx
ory
depends onorientation
xmin
orymin
: For intervals, the lower end of the interval from the interval function.xmax
orymax
: For intervals, the upper end of the interval from the interval function..width
: For intervals, the interval width as a numeric value in[0, 1]
. For slabs, the width of the smallest interval containing that value of the slab.level
: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.pdf
: For slabs, the probability density function (PDF). Ifoptions("ggdist.experimental.slab_data_in_intervals")
isTRUE
: For intervals, the PDF at the point summary; intervals also havepdf_min
andpdf_max
for the PDF at the lower and upper ends of the interval.cdf
: For slabs, the cumulative distribution function. Ifoptions("ggdist.experimental.slab_data_in_intervals")
isTRUE
: For intervals, the CDF at the point summary; intervals also havecdf_min
andcdf_max
for the CDF at the lower and upper ends of the interval.n
: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via thexdist
,ydist
, ordist
aesthetic,n
will beInf
.f
: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined byslab_type
. Instead of usingslab_type
to changef
and then mappingf
onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g.pdf
,cdf
, or1 - cdf
) directly onto the desired aesthetic.
Aesthetics
The slab+interval stat
s and geom
s have a wide variety of aesthetics that control
the appearance of their three sub-geometries: the slab, the point, and
the interval.
These stat
s support the following aesthetics:
x
: x position of the geometry (when orientation ="vertical"
); or sample data to be summarized (whenorientation = "horizontal"
with sample data).y
: y position of the geometry (when orientation ="horizontal"
); or sample data to be summarized (whenorientation = "vertical"
with sample data).weight
: When using samples (i.e. thex
andy
aesthetics, notxdist
orydist
), optional weights to be applied to each draw.xdist
: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g.dist_normal()
) or aposterior::rvar()
object.ydist
: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g.dist_normal()
) or aposterior::rvar()
object.dist
: When using analytical distributions, a name of a distribution (e.g."norm"
), a distributional object (e.g.dist_normal()
), or aposterior::rvar()
object. See Details.args
: Distribution arguments (args
orarg1
, ...arg9
). See Details.
In addition, in their default configuration (paired with geom_slabinterval()
)
the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness
: The thickness of the slab at eachx
value (iforientation = "horizontal"
) ory
value (iforientation = "vertical"
) of the slab.side
: Which side to place the slab on."topright"
,"top"
, and"right"
are synonyms which cause the slab to be drawn on the top or the right depending on iforientation
is"horizontal"
or"vertical"
."bottomleft"
,"bottom"
, and"left"
are synonyms which cause the slab to be drawn on the bottom or the left depending on iforientation
is"horizontal"
or"vertical"
."topleft"
causes the slab to be drawn on the top or the left, and"bottomright"
causes the slab to be drawn on the bottom or the right."both"
draws the slab mirrored on both sides (as in a violin plot).scale
: What proportion of the region allocated to this geom to use to draw the slab. Ifscale = 1
, slabs that use the maximum range will just touch each other. Default is0.9
to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see thethickness
scale article.justification
: Justification of the interval relative to the slab, where0
indicates bottom/left justification and1
indicates top/right justification (depending onorientation
). Ifjustification
isNULL
(the default), then it is set automatically based on the value ofside
: whenside
is"top"
/"right"
justification
is set to0
, whenside
is"bottom"
/"left"
justification
is set to1
, and whenside
is"both"
justification
is set to 0.5.datatype
: When using composite geoms directly without astat
(e.g.geom_slabinterval()
),datatype
is used to indicate which part of the geom a row in the data targets: rows withdatatype = "slab"
target the slab portion of the geometry and rows withdatatype = "interval"
target the interval portion of the geometry. This is set automatically when using ggdiststat
s.
Interval-specific aesthetics
xmin
: Left end of the interval sub-geometry (iforientation = "horizontal"
).xmax
: Right end of the interval sub-geometry (iforientation = "horizontal"
).ymin
: Lower end of the interval sub-geometry (iforientation = "vertical"
).ymax
: Upper end of the interval sub-geometry (iforientation = "vertical"
).
Point-specific aesthetics
shape
: Shape type used to draw the point sub-geometry.
Color aesthetics
colour
: (orcolor
) The color of the interval and point sub-geometries. Use theslab_color
,interval_color
, orpoint_color
aesthetics (below) to set sub-geometry colors separately.fill
: The fill color of the slab and point sub-geometries. Use theslab_fill
orpoint_fill
aesthetics (below) to set sub-geometry colors separately.alpha
: The opacity of the slab, interval, and point sub-geometries. Use theslab_alpha
,interval_alpha
, orpoint_alpha
aesthetics (below) to set sub-geometry colors separately.colour_ramp
: (orcolor_ramp
) A secondary scale that modifies thecolor
scale to "ramp" to another color. Seescale_colour_ramp()
for examples.fill_ramp
: A secondary scale that modifies thefill
scale to "ramp" to another color. Seescale_fill_ramp()
for examples.
Line aesthetics
linewidth
: Width of the line used to draw the interval (except withgeom_slab()
: then it is the width of the slab). With composite geometries including an interval and slab, useslab_linewidth
to set the line width of the slab (see below). For interval, rawlinewidth
values are transformed according to theinterval_size_domain
andinterval_size_range
parameters of thegeom
(see above).size
: Determines the size of the point. Iflinewidth
is not provided,size
will also determines the width of the line used to draw the interval (this allows line width and point size to be modified together by setting onlysize
and notlinewidth
). Rawsize
values are transformed according to theinterval_size_domain
,interval_size_range
, andfatten_point
parameters of thegeom
(see above). Use thepoint_size
aesthetic (below) to set sub-geometry size directly without applying the effects ofinterval_size_domain
,interval_size_range
, andfatten_point
.stroke
: Width of the outline around the point sub-geometry.linetype
: Type of line (e.g.,"solid"
,"dashed"
, etc) used to draw the interval and the outline of the slab (if it is visible). Use theslab_linetype
orinterval_linetype
aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill
: Override forfill
: the fill color of the slab.slab_colour
: (orslab_color
) Override forcolour
/color
: the outline color of the slab.slab_alpha
: Override foralpha
: the opacity of the slab.slab_linewidth
: Override forlinwidth
: the width of the outline of the slab.slab_linetype
: Override forlinetype
: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour
: (orinterval_color
) Override forcolour
/color
: the color of the interval.interval_alpha
: Override foralpha
: the opacity of the interval.interval_linetype
: Override forlinetype
: the line type of the interval.
Point-specific color and line override aesthetics
point_fill
: Override forfill
: the fill color of the point.point_colour
: (orpoint_color
) Override forcolour
/color
: the outline color of the point.point_alpha
: Override foralpha
: the opacity of the point.point_size
: Override forsize
: the size of the point.
Deprecated aesthetics
slab_size
: Useslab_linewidth
.interval_size
: Useinterval_linewidth
.
Other aesthetics (these work as in standard geom
s)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval")
.
Learn more about the sub-geom override aesthetics (like interval_color
) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs")
.
See also
See geom_slabinterval()
for the geom underlying this stat.
See stat_slabinterval()
for the stat this shortcut is based on.
Other slabinterval stats:
stat_ccdfinterval()
,
stat_cdfinterval()
,
stat_eye()
,
stat_gradientinterval()
,
stat_halfeye()
,
stat_interval()
,
stat_pointinterval()
,
stat_slab()
,
stat_spike()
Examples
library(dplyr)
library(ggplot2)
library(distributional)
theme_set(theme_ggdist())
# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
group = c("a", "b", "c"),
value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
ggplot(aes(x = value, y = group)) +
stat_histinterval()
# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
group = c("a", "b", "c"),
mean = c( 5, 7, 8),
sd = c( 1, 1.5, 1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
stat_histinterval()