This section involves various data cleaning tasks for the Nashvillahousing table.
-
Standardize Date Format: Converts the SaleDate column to a standardized date format using CAST.
-
Populate Property Address Data: Fills in missing PropertyAddress values by joining the table with itself based on
ParcelIDand copying non-null addresses. -
Breaking out PropertyAddress: Splits the
PropertyAddresscolumn intoAddressandCityusingSUBSTRINGandCHARINDEX, and stores them in new columns. -
Breaking out OwnerAddress: Similar to above, splits
OwnerAddressinto individual components (OwnerAddress,OwnerCity,OwnerState) usingPARSENAMEandREPLACE, and stores them in new columns. -
Change Y and N to Yes and No: Converts values in the
SoldAsVacantcolumn from 'Y' to 'Yes' and 'N' to 'No'. -
Remove Duplicates: Utilizes a Common Table Expression (CTE) with
ROW_NUMBER()to identify and remove duplicate rows based on certain columns. -
Delete Unused Columns: Drops specified columns (
OwnerAddress,TaxDistrict,PropertyAddress,SaleDate) from the table.
Repository created by [Diyan].