Stata 19 Features & Capabilities

Linear models

regression • censored outcomes • endogenous regressors • bootstrap, jackknife, and robust and cluster–robust variance • wild cluster bootstrap • instrumental variables • three-stage least squares • constraints • quantile regression • GLS • DID • more

Time series

ARIMA • ARFIMA • ARCH/GARCH • VAR • VECM • multivariate GARCH • unobserved-components model • dynamic factors • state-space models • Markov-switching models • business calendars • tests for structural breaks • threshold regression • forecasts • impulse–response functions • local projections • unit-root tests • filters and smoothers • rolling and recursive estimation • GLS • Bayesian • more

Data manipulation

data transformations • data frames • match-merge • import/export data • JDBC • ODBC • SQL • Unicode • by-group processing • append files • sort • row–column transposition • labelling • save results • more

Panel/longitudinal data

random and fixed effects with robust standard errors • linear mixed models • random-effects probit • GEE • random- and fixed-effects Poisson • dynamic panel-data models • instrumental variables • DID • panel unit-root tests • more

Survival analysis

Kaplan–Meier and Nelson–Aalen estimators • Cox regression (frailty) • parametric models (frailty, random effects) • competing risks • hazards • time-varying covariates • left-, right-, and interval-censoring • Weibull, exponential, and Gompertz models • more

Reporting

reproducible reports • customisable tables • graphical tables builder • Word • Excel • PDF • HTML • dynamic documents • Markdown • Stata results and graphs • SVG • EPS • PNG • TIF • more

Multilevel mixed-effects models

continuous, binary, count, and survival outcomes • two-, three-, and higher-level models • generalized linear models • nonlinear models • random intercepts • random slopes • crossed random effects • BLUPs of effects and fitted values • hierarchical models • residual error structures • DDF adjustments • support for survey data • more

Bayesian analysis

thousands of built-in models • univariate and multivariate models • linear and nonlinear models • panel data • multilevel models • VAR • DSGE • continuous, binary, ordinal, and count outcomes • bayes: prefix for 58 estimation commands • continuous univariate, multivariate, and discrete priors • add your own models • multiple chains • convergence diagnostics • posterior summaries • hypothesis testing • model fit • model comparison • predictions • dynamic forecast • impulse–response functions • more

Bayesian model averaging

full enumeration • MC3 and MH sampling • three model prior classes • fixed and random g-priors for coefficients • heredity rules • PIP for predictor • model ranking by PMP • BMA convergence • variable-inclusion maps • model-size distribution plots • jointness measures • log predictive-score • predictions • more

Graphics

lines • bars • areas • ranges • contours • confidence intervals • interaction plots • survival plots • publication quality • customise anything • Graph Editor • more

Binary, count, and limited outcomes

logistic, probit, tobit • Poisson and negative binomial • conditional, multinomial, nested, ordered, rank-ordered, and stereotype logistic • multinomial probit • zero-inflated and left-truncated count models • selection models • marginal effects • more

Meta-analysis

effect sizes • common, fixed, and random effects • forest, funnel, and more plots • subgroup and cumulative analysis • leave-one-out • meta-regression • small-study effects • publication bias • multivariate • multilevel • more

Programming features

adding new commands • scripting • object-oriented programming • menu and dialog-box programming • dynamic documents • Markdown • Project Manager • Python integration • PyStata • Jupyter notebook • Java integration • Java plugins • H2O access • C/C++ plugins • more

Choice models

discrete choice • rank-ordered alternatives • conditional logit • multinomial probit • nested logit • mixed logit • panel data • case-specific and alternative-specific predictors • interpret results—expected probabilities, covariate effects, comparisons across alternatives • more

Power, precision, and sample size

power • sample size • effect size • minimum detectable effect • CI width • means • proportions • variances • correlations • ANOVA • regression • cluster randomized designs • case–control studies • cohort studies • contingency tables • survival analysis • balanced or unbalanced designs • results in tables or graphs • group sequential designs for clinical trials • more

Mata—Stata's serious programming language

interactive sessions • large-scale development projects • optimization • matrix inversions • decompositions • eigenvalues and eigenvectors • LAPACK engine • Intel® MKL • real and complex numbers • string matrices • interface to Stata datasets and matrices • numerical derivatives • object-oriented programming • more

Extended regression models (ERMs)

endogenous covariates • sample selection • nonrandom treatment • panel data • account for problems alone or in combination • continuous, interval-censored, binary, and ordinal outcomes • more

Causal inference / Treatment effects

inverse probability weighting (IPW) • doubly robust methods • propensity-score matching • regression adjustment • covariate matching • DID • multilevel treatments • endogenous treatments • average treatment effects (ATEs) • ATEs on the treated (ATETs) • potential-outcome means (POMs) • continuous, binary, count, fractional, and survival outcomes • panel data • lasso • causal mediation analysis • more

Graphical user interface

menus and dialogs for all features • Data Editor • Variables Manager • Graph Editor • Project Manager • Do-file Editor • multiple preference sets • more

Generalized linear models (GLMs)

ten link functions • user-defined links • seven distributions • ML and IRLS estimation • nine variance estimators • seven residuals • more

Lasso

lasso • elastic net • model selection • prediction • inference • continuous, binary, and count outcomes • cross-validation • adaptive lasso • double selection • partialing out • cross-fit partialing out • double machine learning • endogenous covariates • treatment effects • more

Documentation

36 manuals • 19,000+ pages • seamless navigation • thousands of worked examples • quick starts • methods and formulas • references • more

Finite mixture models (FMMs)

fmm: prefix for 17 estimators • mixtures of a single estimator • mixtures combining multiple estimators or distributions • continuous, binary, count, ordinal, categorical, censored, truncated, and survival outcomes • more

SEM (structural equation modelling)

graphical path diagram builder • standardized and unstandardized estimates • modification indices • direct and indirect effects • continuous, binary, count, ordinal, and survival outcomes • multilevel models • random slopes and intercepts • factor scores, empirical Bayes, and other predictions • groups and tests of invariance • goodness of fit • handles MAR data by FIML • correlated data • survey data • more

Basic statistics

summaries • cross-tabulations • correlations • z and t tests • equality-of-variance tests • tests of proportions • confidence intervals • factor variables • more

Spatial autoregressive models

spatial lags of dependent variable, independent variables, and autoregressive errors • fixed and random effects in panel data • endogenous covariates • analyse spillover effects • more

Latent class analysis

binary, ordinal, continuous, count, categorical, fractional, and survival items • add covariates to model class membership • combine with SEM path models • expected class proportions • goodness of fit • predictions of class membership • more

Nonparametric methods

nonparametric regression • Wilcoxon–Mann–Whitney, Wilcoxon signed ranks, and Kruskal–Wallis tests • Cochran–Armitage and other trend tests • Spearman and Kendall correlations • Kolmogorov–Smirnov tests • exact binomial CIs • survival data • ROC analysis • smoothing • bootstrapping • more

ANOVA/MANOVA

balanced and unbalanced designs • factorial, nested, and mixed designs • repeated measures • marginal means • contrasts • more

Multiple imputation

nine univariate imputation methods • multivariate normal imputation • chained equations • explore pattern of missingness • manage imputed datasets • fit model and pool results • transform parameters • joint tests of parameter estimates • predictions • more

Nonlinear regression, GMM and other systems of equations

generalized method of moments (GMM) • nonlinear regression • demand systems • more

Exact statistics

exact logistic and Poisson regression • exact case–control statistics • binomial tests • Fisher’s exact test for r × c tables • more

Survey methods

multistage designs • bootstrap, BRR, jackknife, linearized, and SDR variance estimation • poststratification • raking • calibration • DEFF • predictive margins • means, proportions, ratios, totals • summary tables • almost all estimators supported • more

Simple maximum likelihood

specify likelihood using simple expressions • no programming required • survey data • standard, robust, bootstrap, and jackknife SEs • matrix estimators • more

Epidemiology

standardization of rates • case–control • cohort • matched case–control • Mantel–Haenszel • pharmacokinetics • ROC analysis • ICD-10 • additive models of risk • more

Cluster analysis

hierarchical clustering • kmeans and kmedian nonhierarchical clustering • dendrograms • stopping rules • user-extensible analyses • more

Network analysis

nwcommands: import and manipulate networks • generate networks • calculate centrality and dissimilarity measures • visualise networks • more

Programmable maximum likelihood

user-specified functions • NR, DFP, BFGS, BHHH • OIM, OPG, robust, bootstrap, and jackknife SEs • Wald tests • survey data • numeric or analytic derivatives • more

DSGE models

specify models algebraically • solve models • estimate parameters • identification diagnostics • policy and transition matrices • IRFs • dynamic forecasts • Bayesian • more

IRT (item response theory)

binary (1PL, 2PL, 3PL), ordinal, and categorical response models • item characteristic curves • test characteristic curves • item information functions • test information functions • multiple-group models • differential item functioning (DIF) • more

Other statistical methods

kappa measure of interrater agreement • Cronbach's alpha • stepwise regression • tests of normality • more

Tests, predictions, and effects

Wald tests • LR tests • linear and nonlinear combinations • predictions and generalized predictions • marginal means • least-squares means • adjusted means • marginal and partial effects • forecast models • Hausman tests • more

Multivariate methods

factor analysis • principal components • discriminant analysis • rotation • multidimensional scaling • Procrustean analysis • correspondence analysis • biplots • dendrograms • user-extensible analyses • more

Functions

statistical • random-number • mathematical • string • date and time • regular expressions • Unicode • more

Contrasts, pairwise comparisons, and margins

compare means, intercepts, or slopes • compare with reference category, adjacent category, grand mean, etc. • orthogonal polynomials • multiple-comparison adjustments • graph estimated means and contrasts • interaction plots • more

Internet capabilities

search and download thousands of community-contributed features • web updating • web file sharing • latest Stata news • more

Resampling and simulation methods

bootstrap • jackknife • Monte Carlo simulation • permutation tests • exact p-values • more

Community-contributed commands

search and download thousands of free additions • discover new features in the Stata Journal • share commands by posting to the SSC • discuss community-contributed commands on Statalist • more

Installation Qualification

IQ report for regulatory agencies such as the FDA • installation verification • more

FDA Compliance

Adherence to FDA regulatory requirements for statistical software • more

Accessibility

Section 508 compliance, accessibility for persons with disabilities • more

New in Stata 19

machine learning via H2O • conditional average treatment effects • high-dimensional fixed effects • Bayesian variable selection for linear regression • marginal Cox PH models for interval-censored multiple-event data • meta-analysis for correlations • correlated random-effects model • panel-data vector autoregressive model • more
