Script 219: SBA pacing check

Purpose:

The Python script evaluates the pacing of campaigns against their structured budget allocation targets by analyzing conversion data.

To Elaborate

The script is designed to assess whether marketing campaigns are on track to meet their structured budget allocation (SBA) targets. It processes input data to calculate conversion metrics and uses these to predict future performance. The script filters data from the start of the current month and groups it by SBA bucket names to sum up organic and PPC conversions. It then calculates the median of recent conversions and projects these figures over the remaining days of the month. The script also evaluates recent paid conversions and costs to determine if campaigns are under or on pace to meet their targets. It flags campaigns with pacing issues and provides a predicted total conversion figure, helping stakeholders make informed decisions about budget adjustments.

Walking Through the Code

  1. Data Preparation:
    • The script begins by converting the SBA bucket end date to a datetime format and filters the input data to include only records from the start of the current month.
  2. Grouping and Aggregation:
    • It groups the filtered data by SBA bucket names and calculates the sum of organic and PPC conversions for each group.
    • The total conversions are computed by summing organic and PPC conversions.
  3. Recent Conversion Analysis:
    • The script identifies the most recent week of data and calculates the median of organic conversions for this period.
    • It projects this median over the remaining days of the month to estimate future conversions.
  4. Mean Calculation for Paid Conversions:
    • The mean of PPC conversions over the recent week is calculated and projected over the remaining days of the month.
  5. Data Merging:
    • The script merges the aggregated conversion data with the projected median and mean values to form a comprehensive dataset.
  6. Campaign Grouping:
    • It duplicates rows for each campaign within the SBA bucket and combines this with the comprehensive dataset.
  7. Recent Paid Conversion Analysis:
    • The script calculates the sum of paid conversions and costs over the last five days for each campaign.
  8. Pacing Evaluation:
    • It evaluates whether campaigns are under or on pace by comparing predicted totals against SBA targets and recent paid conversion data.
    • Flags are set for campaigns based on their pacing status, with detailed messages indicating predicted totals and pacing issues.
  9. Output Preparation:
    • The final output dataframe is prepared with pacing check results and printed for review.

Vitals

  • Script ID : 219
  • Client ID / Customer ID: 1306923673 / 60269245
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, SBA_Pacing_Check
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Jonathan Reichl (jreichl@marinsoftware.com)
  • Created by Jonathan Reichl on 2023-06-21 13:20
  • Last Updated by Jonathan Reichl on 2023-12-06 04:01
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_PPC_UNIQUEVISITORSCONV = 'PPC Unique Visitors Conv.'
RPT_COL_ORGANIC_UNIQUEVISITORSCONV = 'Organic Unique Visitors Conv.'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_SBA_BUCKETNAME = 'SBA Bucket Name'
RPT_COL_SBA_BUDGETMODELTARGET = 'SBA Budget Model Target'
RPT_COL_DATE = 'Date'
RPT_COL_SBA_BUCKET_END_DATE = 'SBA Bucket End Date'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_SBA_PACING_CHECK = 'SBA_Pacing_Check'

outputDf[BULK_COL_SBA_PACING_CHECK] = "<<YOUR VALUE>>"

#today = datetime.datetime.now(CLIENT_TIMEZONE).date()
today = pd.Timestamp.now(CLIENT_TIMEZONE).date()

#print(tableize(inputDf))

inputDf[RPT_COL_SBA_BUCKET_END_DATE] = pd.to_datetime(inputDf[RPT_COL_SBA_BUCKET_END_DATE])


start_of_month = pd.Timestamp(datetime .date(today.year, today.month, 1))
#filtered_df = inputDf[(inputDf[RPT_COL_DATE] >= start_of_month) & (inputDf[RPT_COL_SBA_BUCKET_END_DATE] >= start_of_month)]
filtered_df = inputDf[(inputDf[RPT_COL_DATE] >= start_of_month) ]


# Step 2: Group the filtered dataframe by 'campaign_group' and calculate the sum of RPT_COL_ORGANIC_UNIQUEVISITORSCONV and RPT_COL_PPC_UNIQUEVISITORSCONV for each group
grouped_df = filtered_df.groupby([RPT_COL_SBA_BUCKETNAME]).agg({RPT_COL_ORGANIC_UNIQUEVISITORSCONV: 'sum', RPT_COL_PPC_UNIQUEVISITORSCONV: 'sum'})

# Step 3: Calculate the total conversions (sum of RPT_COL_ORGANIC_UNIQUEVISITORSCONV and RPT_COL_PPC_UNIQUEVISITORSCONV) for each group
grouped_df['total_conversions'] = grouped_df[RPT_COL_ORGANIC_UNIQUEVISITORSCONV] + grouped_df[RPT_COL_PPC_UNIQUEVISITORSCONV]

# Step 4: Calculate the median of the most recent 7 days of conversions for each campaign group
most_recent_week = pd.Timestamp(today - datetime.timedelta(days=7))
recent_week_df = filtered_df[filtered_df[RPT_COL_DATE] >= most_recent_week]
print('recent_week df print')
print(tableize(recent_week_df))

#### swapped the 2 rows bwlow 1010
#median_recent_week = recent_week_df.groupby([RPT_COL_SBA_BUCKETNAME]).agg({RPT_COL_ORGANIC_UNIQUEVISITORSCONV: 'median'})
recent_week_df = recent_week_df[recent_week_df[RPT_COL_ORGANIC_UNIQUEVISITORSCONV] != 0]
median_recent_week = recent_week_df.groupby([RPT_COL_SBA_BUCKETNAME], as_index=False).agg({RPT_COL_ORGANIC_UNIQUEVISITORSCONV: 'median'})
print ('median_recent_week')
print(tableize(median_recent_week))

# Step 5: Calculate the median * days remaining in the month
if today.month == 12:
    days_remaining = (datetime.date(today.year + 1, 1, 1) - today).days
else:
    days_remaining = (datetime.date(today.year, today.month + 1, 1) - today).days
#median_times_remaining = median_recent_week * days_remaining
median_times_remaining = median_recent_week.copy()  # Create a copy of median_recent_week
median_times_remaining[RPT_COL_ORGANIC_UNIQUEVISITORSCONV] = median_times_remaining[RPT_COL_ORGANIC_UNIQUEVISITORSCONV] * days_remaining

#print('organic median')
#print(median_times_remaining.columns)
#print (tableize(median_times_remaining))
# Step 6: Merge the results into the new dataframe
new_df = pd.merge(grouped_df, median_times_remaining, on=[RPT_COL_SBA_BUCKETNAME])
#new_df.reset_index(inplace=True)  # Reset the index to remove multi-index from groupby result
#print(new_df.columns)
#print (tableize(new_df))

# calculate the mean of paid conversions 
most_recent_week = pd.Timestamp(today - datetime.timedelta(days=7))
recent_week_df = filtered_df[filtered_df[RPT_COL_DATE] >= most_recent_week]
Mean_recent_week = recent_week_df.groupby([RPT_COL_SBA_BUCKETNAME]).agg({RPT_COL_PPC_UNIQUEVISITORSCONV: 'mean'})

#  Calculate the mean * days remaining in the month
#days_remaining_mean = (datetime.date(today.year, today.month + 1, 1) - today).days
mean_times_remaining = Mean_recent_week * days_remaining

#add mean to df 
#swapped the 2 rows below 1010 drop line was the new row 
new_df = pd.merge(new_df, mean_times_remaining, on=[RPT_COL_SBA_BUCKETNAME])
#new_df = pd.merge(new_df, mean_times_remaining.drop(columns=[RPT_COL_SBA_BUCKETNAME]), on=[RPT_COL_SBA_BUCKETNAME], suffixes=('', '_mean'))

new_df.reset_index(inplace=True)  # Reset the index to remove multi-index from groupby result
#print(new_df.columns)
#print (tableize(new_df))

new_df.columns = ['index_col',RPT_COL_SBA_BUCKETNAME, RPT_COL_ORGANIC_UNIQUEVISITORSCONV, RPT_COL_PPC_UNIQUEVISITORSCONV, 'total_conversions',  'median_times_remaining' , 'mean_times_remaining']



# Step 7: Duplicate the rows for each campaign_name within the campaign_group
campaign_group_df = inputDf[[RPT_COL_SBA_BUCKETNAME, RPT_COL_CAMPAIGN, RPT_COL_SBA_BUDGETMODELTARGET,RPT_COL_ACCOUNT, RPT_COL_SBA_BUCKET_END_DATE]].drop_duplicates()

# Step 8: Combine the two dataframes based on campaign_group
combined_df = pd.merge(campaign_group_df, new_df, on= [RPT_COL_SBA_BUCKETNAME], how='left')



#last 5 days paid conversions by bucket and campaign 
last_5_days = pd.Timestamp(today - datetime.timedelta(days=5))
recent_5_day_df = filtered_df[filtered_df[RPT_COL_DATE] >= last_5_days]
#sum_5_day_paid_conv_df = recent_week_df.groupby([RPT_COL_SBA_BUCKETNAME ]).agg({RPT_COL_PPC_UNIQUEVISITORSCONV: 'sum'})
sum_5_day_paid_conv_df = recent_week_df.groupby([RPT_COL_CAMPAIGN ]).agg({RPT_COL_PPC_UNIQUEVISITORSCONV: 'sum', RPT_COL_PUB_COST: 'sum'})
sum_5_day_paid_conv_df.reset_index(inplace=True) 

sum_5_day_paid_conv_df.columns = [RPT_COL_CAMPAIGN,'sum_5_day_paid', 'sum_5_day_cost']


combined_df = pd.merge(combined_df, sum_5_day_paid_conv_df, on= [RPT_COL_CAMPAIGN], how='left')




combined_df['TMP_FIELD'] = np.nan
combined_df[BULK_COL_SBA_PACING_CHECK] = np.nan
# multiplying by a factor as we are missing some days 
combined_df['TMP_FIELD'] = (combined_df['total_conversions'] + combined_df['median_times_remaining'] + combined_df['mean_times_remaining']) * 1.15
#print('combined################-----------------------------------')
#print(tableize(combined_df))


combined_df.loc[(combined_df[RPT_COL_SBA_BUCKET_END_DATE] < start_of_month) , BULK_COL_SBA_PACING_CHECK ] = 'Bucket not enrolled for this month'
combined_df.loc[(combined_df[RPT_COL_ORGANIC_UNIQUEVISITORSCONV] == 0)& (combined_df[BULK_COL_SBA_PACING_CHECK].isnull())  , BULK_COL_SBA_PACING_CHECK ] = '0 organics this month, predicted total ' + combined_df['TMP_FIELD'].astype(str)
combined_df.loc[ (combined_df[RPT_COL_SBA_BUDGETMODELTARGET] > combined_df['TMP_FIELD']) & (combined_df['sum_5_day_paid'] == 0) & (combined_df[BULK_COL_SBA_PACING_CHECK].isnull()) , BULK_COL_SBA_PACING_CHECK ] = 'Under Pacing no paid, predicted total ' + combined_df['TMP_FIELD'].astype(str)
combined_df.loc[ (combined_df[RPT_COL_SBA_BUDGETMODELTARGET] > combined_df['TMP_FIELD']) & (combined_df['sum_5_day_paid'] < 150) & (combined_df['sum_5_day_cost'] == 0 )  & (combined_df[BULK_COL_SBA_PACING_CHECK].isnull()) , BULK_COL_SBA_PACING_CHECK ] = 'Under Pacing low paid & 0 Cost, predicted total ' + combined_df['TMP_FIELD'].astype(str)
combined_df.loc[ (combined_df[RPT_COL_SBA_BUDGETMODELTARGET] > combined_df['TMP_FIELD']) & (combined_df['sum_5_day_paid'] < 150) & (combined_df[BULK_COL_SBA_PACING_CHECK].isnull()) , BULK_COL_SBA_PACING_CHECK ] = 'Under Pacing low paid, predicted total ' + combined_df['TMP_FIELD'].astype(str)
combined_df.loc[ (combined_df[RPT_COL_SBA_BUDGETMODELTARGET] > combined_df['TMP_FIELD']) & (combined_df[BULK_COL_SBA_PACING_CHECK].isnull()) , BULK_COL_SBA_PACING_CHECK ] = 'Under Pacing, predicted total ' + combined_df['TMP_FIELD'].astype(str)
combined_df.loc[ (combined_df[BULK_COL_SBA_PACING_CHECK].isnull()) , BULK_COL_SBA_PACING_CHECK ] = 'Pacing, predicted total ' + combined_df['TMP_FIELD'].astype(str)


outputDf = combined_df

print('output################-----------------------------------')
print(tableize(outputDf))

Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus