- Read in the crime HDF5 dataset, set REPORTED_DATE as the index, and then sort the index to improve performance for the rest of the recipe:
>>> crime_sort = pd.read_hdf('data/crime.h5', 'crime') \
                   .set_index('REPORTED_DATE') \
                   .sort_index()
- Use the resample method to group by each quarter of the year and then sum the IS_CRIME and IS_TRAFFIC columns for each group:
>>> crime_quarterly = crime_sort.resample('Q')[['IS_CRIME', 'IS_TRAFFIC']].sum()
>>> crime_quarterly.head()
- Notice that the dates all appear as the last day of the quarter. This is because the offset alias, Q, represents the end of the quarter. Let's use the offset alias ...
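How the choice of offset changes the group labels can be seen on a small, made-up frame. The sketch below is not the recipe's own code: the `toy` DataFrame and its values are hypothetical stand-ins for the crime data, and it passes offset objects (`pd.offsets.QuarterEnd` and `pd.offsets.QuarterBegin`) to `resample` rather than string aliases, since the spelling of the quarter-end alias may differ between pandas versions.

```python
import pandas as pd

# Hypothetical toy data standing in for the crime dataset (values are made up)
toy = pd.DataFrame(
    {
        "IS_CRIME": [1, 1, 0, 1, 1, 0],
        "IS_TRAFFIC": [0, 0, 1, 0, 1, 1],
    },
    index=pd.to_datetime(
        [
            "2017-01-15", "2017-02-20", "2017-04-05",
            "2017-07-09", "2017-07-30", "2017-11-11",
        ]
    ),
)
toy.index.name = "REPORTED_DATE"

# Quarter-end offset: each group is labeled with the last day of its quarter
q_end = toy.resample(pd.offsets.QuarterEnd(startingMonth=12)).sum()

# Quarter-start offset: each group is labeled with the first day of its quarter
q_start = toy.resample(pd.offsets.QuarterBegin(startingMonth=1)).sum()

print(q_end)
print(q_start)
```

Both results hold the same sums per quarter; only the index labels differ, falling on the last day of each quarter in `q_end` and on the first day in `q_start`.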