floor-votes-by-year

Created by: SmirkyGraphs. Code: Github. Source: RI Legislature Site.


Rhode Island House and Senate Floor Votes by Year

The final days of a session can get a bit crazy for legislatures. The last day of 2016 saw a whopping 212 votes for the House and 127 for the Senate. This notebook will take a look at how many votes occur each year, when the session ends and how many votes occur in the final days of a session compared to the year as a whole for both the House and Senate.


Number of House Votes and Sessions by Year

In [1]:
import pandas as pd
import numpy as np
In [2]:
df = pd.read_csv('../data/clean/house_vote_info_clean.csv')
In [3]:
# adding cols to make data prep easier
df['count'] = 1
df['session'] = df['id'].str[2:6]

# remove roll call as it wasnt present in earlier years
df = df[df['vote_category'] != 'ROLL CALL']

# dates to remove (a tour and a possible test?)
df = df[df['bill_date'] != '2003-11-12']
df = df[df['bill_date'] != '2008-11-25']
In [4]:
# getting count of total votes for each individual session day
h_df = df.groupby(['session', 'bill_date'])['count'].count().reset_index()

# getting running total of floor votes for each session
h_df['votes_total'] = h_df.groupby(h_df['session'].ne(h_df['session'].shift()).cumsum()).cumsum()

# getting a running total of number of days for each session
h_df['day_id'] = h_df.groupby(h_df['session'].ne(h_df['session'].shift()).cumsum()).cumcount().add(1)
In [5]:
# previewing the table
h_df.head(2)
Out[5]:
session bill_date count votes_total day_id
0 2003 2003-01-08 7 7 1
1 2003 2003-01-09 1 8 2
In [6]:
# day with most house votes
h_df[h_df['count'] == h_df['count'].max()]
Out[6]:
session bill_date count votes_total day_id
801 2016 2016-06-18 212 920 59
In [7]:
# getting table of final days for each session
house_votes = h_df.groupby('session').last()

# add fitting headers and removing index label
cols = ['last_day', 'last_day_votes', 'total_votes', 'total_sessions']
house_votes.columns = cols
house_votes.columns.name = house_votes.index.name
house_votes.index.name = None
In [8]:
house_votes
Out[8]:
session last_day last_day_votes total_votes total_sessions
2003 2003-07-15 36 948 68
2004 2004-07-30 15 1140 56
2005 2005-07-15 13 1053 57
2006 2006-06-24 82 1158 59
2007 2007-10-30 40 1122 54
2008 2008-06-21 188 955 55
2009 2009-10-30 6 831 57
2010 2010-06-11 43 749 47
2011 2011-11-17 29 805 57
2012 2012-06-13 26 870 59
2013 2013-07-03 87 984 60
2014 2014-06-21 31 981 57
2015 2015-06-25 24 645 57
2016 2016-06-18 212 920 59
2017 2017-09-19 157 880 61
2018 2018-06-23 102 716 50
2019 2019-06-28 105 624 65

[House] Total Votes Final 10 Days vs. All Other Days

In [9]:
# getting count of total votes for each individual session day
end_df = df.groupby(['session', 'bill_date'])['count'].count().reset_index()
In [10]:
# getting running total of floor votes for each session
end_df['votes_total'] = end_df.groupby(end_df['session'].ne(end_df['session'].shift()).cumsum()).cumsum()

# getting a REVERSED running total of number of days for each session
end_df['day_id'] = end_df.groupby(end_df['session'].ne(end_df['session'].shift()).cumsum()).cumcount(ascending=False).add(1)
In [11]:
end_df.loc[end_df['day_id'] <= 10, 'final_10_days'] = 'final_10_days'
end_df['final_10_days'] = end_df['final_10_days'].fillna('other_days')
In [12]:
end_df = end_df.groupby(['session', 'final_10_days'])['count'].sum().reset_index()
end_df = end_df.pivot_table(columns='final_10_days', index='session', values='count')

end_df['total'] = end_df['final_10_days'] + end_df['other_days']
end_df['%_final_10'] = end_df['final_10_days']/end_df['total']

# removing total and cleaning % format
house_end_df = end_df.drop(columns='total')
house_end_df['%_final_10'] = house_end_df['%_final_10'].map(lambda x: '{:,.1%}'.format(x))

# add fitting headers and removing index label
house_end_df.columns.name = house_end_df.index.name
house_end_df.index.name = None
In [13]:
house_end_df
Out[13]:
session final_10_days other_days %_final_10
2003 545 403 57.5%
2004 599 541 52.5%
2005 572 481 54.3%
2006 642 516 55.4%
2007 599 523 53.4%
2008 596 359 62.4%
2009 485 346 58.4%
2010 491 258 65.6%
2011 401 404 49.8%
2012 518 352 59.5%
2013 556 428 56.5%
2014 589 392 60.0%
2015 386 259 59.8%
2016 620 300 67.4%
2017 564 316 64.1%
2018 481 235 67.2%
2019 387 237 62.0%

Number of Senate Votes and Sessions by Year

In [14]:
df = pd.read_csv('../data/clean/senate_vote_info_clean.csv')
In [15]:
# adding cols to make data prep easier
df['count'] = 1
df['session'] = df['id'].str[2:6]

# remove roll call for normalization (did so for house)
df = df[df['vote_category'] != 'ROLL CALL']
In [16]:
# getting count of total votes for each individual session day
s_df = df.groupby(['session', 'bill_date'])['count'].count().reset_index()

# getting running total of floor votes for each session
s_df['votes_total'] = s_df.groupby(s_df['session'].ne(s_df['session'].shift()).cumsum()).cumsum()

# getting a running total of number of days for each session
s_df['day_id'] = s_df.groupby(s_df['session'].ne(s_df['session'].shift()).cumsum()).cumcount().add(1)
In [17]:
# previewing the table
s_df.head(2)
Out[17]:
session bill_date count votes_total day_id
0 2016 2016-01-06 1 1 1
1 2016 2016-01-20 1 2 2
In [18]:
# day with most senate votes
s_df[s_df['count'] == s_df['count'].max()]
Out[18]:
session bill_date count votes_total day_id
57 2016 2016-06-18 127 722 58
In [19]:
# getting table of final days for each session
senate_votes = s_df.groupby('session').last()

# add fitting headers and removing index label
cols = ['last_day', 'last_day_votes', 'total_votes', 'total_sessions']
senate_votes.columns = cols
senate_votes.columns.name = senate_votes.index.name
senate_votes.index.name = None
In [20]:
senate_votes
Out[20]:
session last_day last_day_votes total_votes total_sessions
2016 2016-06-18 127 722 58
2017 2017-09-19 85 575 63
2018 2018-06-23 124 524 53
2019 2019-06-28 36 572 61

[Senate] Total Votes Final 10 Days vs. All Other Days

In [21]:
# getting count of total votes for each individual session day
end_df = df.groupby(['session', 'bill_date'])['count'].count().reset_index()
In [22]:
# getting running total of floor votes for each session
end_df['votes_total'] = end_df.groupby(end_df['session'].ne(end_df['session'].shift()).cumsum()).cumsum()

# getting a REVERSED running total of number of days for each session
end_df['day_id'] = end_df.groupby(end_df['session'].ne(end_df['session'].shift()).cumsum()).cumcount(ascending=False).add(1)
In [23]:
end_df.loc[end_df['day_id'] <= 10, 'final_10_days'] = 'final_10_days'
end_df['final_10_days'] = end_df['final_10_days'].fillna('other_days')
In [24]:
end_df = end_df.groupby(['session', 'final_10_days'])['count'].sum().reset_index()
end_df = end_df.pivot_table(columns='final_10_days', index='session', values='count')

end_df['total'] = end_df['final_10_days'] + end_df['other_days']
end_df['%_final_10'] = end_df['final_10_days']/end_df['total']

# removing total and cleaning % format
senate_end_df = end_df.drop(columns='total')
senate_end_df['%_final_10'] = senate_end_df['%_final_10'].map(lambda x: '{:,.1%}'.format(x))

# add fitting headers and removing index label
senate_end_df.columns.name = senate_end_df.index.name
senate_end_df.index.name = None
In [25]:
senate_end_df
Out[25]:
session final_10_days other_days %_final_10
2016 404 318 56.0%
2017 316 259 55.0%
2018 304 220 58.0%
2019 335 237 58.6%

Creating Export Report for Tableau

In [26]:
# adding chamber label to house sets
house_votes['chamber'] = 'house'
house_end_df['chamber'] = 'house'

# adding chamber label to senate sets
senate_votes['chamber'] = 'senate'
senate_end_df['chamber'] = 'senate'
In [27]:
final_10_days = pd.concat([house_end_df, senate_end_df])
total_floor_votes = pd.concat([house_votes, senate_votes])

final_10_days.index.name = 'session'
total_floor_votes.index.name = 'session'

final_10_days = final_10_days.reset_index()
total_floor_votes = total_floor_votes.reset_index()

final_10_days.to_csv('../data/reports/total-floor-votes/final_10_days.csv', index=False)
total_floor_votes.to_csv('../data/reports/total-floor-votes/total_floor_votes.csv', index=False)