Note: When working with missing data, you need to consider why that data is missing. In survey data, missing values may mean that the surveyor did not ask the question, that the respondent did not answer the question, or that the data are truly missing. (Some datasets have these three cases coded differently; others lump them together. Check your metadata/codebook to make sure you know what you are working with!)

For numeric data, keep in mind that missing data are not the same as a value of zero. (This may seem obvious, but I have had many students nonchalantly say "oh, so we can just replace those with zeros." Nope.) Consider this in the context of gas mileage. MPG = 0 is very different from MPG = "I'm not sure."

Different statistical software code missing data differently. In Stata, if your variable is numeric and you are missing data, you will see . (a single period). If you are working with string variables, the data will appear as "" (an empty string).

Missing data values will affect how Stata handles your data. Some common procedures are below; for others, check the Stata documentation.

Summarize - uses only non-missing values.
Tabulate - missing values excluded by default; use the missing option within tab to include missing values.
Correlations - correlate uses only observations with non-missing data on every variable listed (listwise deletion of missing data); use pwcorr to calculate each correlation on all pairs with non-missing data (pairwise deletion of missing data).
Regression - if an observation is missing data for a variable in the regression model, that observation is excluded from the regression (listwise deletion of missing data).

When you load data into Stata, you will likely look at descriptive statistics or some other data summary. The command summarize reports the number of non-missing observations for each variable, so comparing that number with the total number of observations tells you how many values are missing. Additional resources you can use to investigate missing values are the packages mdesc, mvpatterns, and misschk. These packages do not come with Stata, but can be downloaded by typing findit mdesc at the Stata command line. (More on findit and installing packages)

Use Stata's drop command, combined with a logical / conditional statement, to drop missing values. Examples:

Drop cases missing string data (for variable "important_string_variable")
drop if important_string_variable == ""

Drop cases missing numeric data (for variable "important_numeric_variable")
drop if important_numeric_variable >= .
(Using >= . rather than == . also catches the extended missing values .a through .z.)

Drop cases missing data (string or numeric, for variable "important_either_kind_of_variable")
drop if missing(important_either_kind_of_variable)

You may also wish to recode or replace missing values; see below for more details on those operations.
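To see how these pieces fit together, here is a minimal sketch that walks through inspecting and then dropping missing values. It is an illustration only, not the workflow from any particular project: it assumes the auto dataset that ships with Stata (loaded with sysuse auto), where rep78 is a numeric variable with some missing values, and it creates a hypothetical string variable called origin_label purely so there is an empty-string example to work with. misstable is a built-in command in recent versions of Stata and is included here as an alternative to the user-written packages.

```stata
* Illustrative sketch only: uses Stata's bundled auto dataset
sysuse auto, clear

* summarize counts only non-missing values in its Obs column
summarize rep78 mpg

* Count the observations that are missing rep78
count if missing(rep78)

* Built-in overview of which variables have missing values
misstable summarize

* tabulate leaves out missing values unless you ask for them
tabulate rep78
tabulate rep78, missing

* correlate uses listwise deletion; pwcorr uses pairwise deletion
* (the obs option shows how many observations each pair used)
correlate price mpg rep78
pwcorr price mpg rep78, obs

* Hypothetical string variable: "Foreign" for foreign cars, "" otherwise
generate str10 origin_label = cond(foreign, "Foreign", "")

* Work inside preserve/restore so the drops can be undone
preserve

* Drop cases missing numeric data
drop if missing(rep78)

* Drop cases missing string data
drop if origin_label == ""

* See how many observations are left, then bring the full data back
count
restore
```

Wrapping the drop commands in preserve and restore is a design choice for experimenting: you can check how many observations would survive before committing to the change in your real do-file.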