Script 1323: Ascend Auto Cap
Purpose:
The Python script automatically enables a campaign's override flag when Ascend's adjusted recommendations exceed predefined caps set in Google Sheets.
To Elaborate
The script manages campaign overrides by comparing each campaign's adjusted Ascend recommendations against caps defined in Google Sheets. The sheet supplies cap rules (a baseline column plus floor and ceiling percentages) and a snapshot of campaign settings to use as baselines; from these, the script computes per-campaign floor and ceiling caps. Whenever the recommendation that would be pushed to the publisher falls outside those caps, the script turns the Campaign Override flag On so the campaign stays within its specified limits; when a previously script-set override is no longer needed, it turns the flag back Off. This keeps budgets and targets within the structured budget-allocation guidelines and prevents excessive spending or runaway targets.
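As a rough illustration with made-up numbers (the real baselines and percentages come from the Google Sheet):

baseline = 50.0                      # baseline value snapshotted in the GSheet, e.g. a target CPA
floor = baseline * (1 + -25 / 100)   # -25% floor cap  -> 37.5
ceiling = baseline * (1 + 50 / 100)  # +50% ceiling cap -> 75.0
adj_rec = 80.0                       # adjusted recommendation about to be pushed
needs_override = (adj_rec < floor) or (adj_rec > ceiling)  # True, since 80.0 > 75.0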
Walking Through the Code
- Initialization and Setup:
- The script begins by determining whether it is running on a server or locally, using a pickle file to load data if running locally.
- It sets up necessary imports and configurations, including timezone settings and pandas display options.
- Data Preparation:
- The script loads the primary data source and prepares the input DataFrame, ensuring data types are correctly set for specific columns.
- It retrieves cap rules and campaign settings from Google Sheets, renaming columns to match report column names.
- Cap Calculation and Rule Application:
- For each rule defined in the cap rules sheet, the script calculates floor and ceiling caps from baseline values (a simplified sketch follows this list).
- It merges these caps back into the input DataFrame and flags rows where the adjusted recommendation falls outside the calculated caps.
- Override Flag Management:
- The script sets the Campaign Override flag for rows where recommendations fall outside the caps.
- It also stamps the AUTO override date when enabling the flag, and clears both the flag and the date when a previously script-set override is no longer needed.
- Output Preparation:
- The script prepares the output DataFrame, selecting only changed rows for inclusion in the bulk file.
- It renames columns for output consistency and writes the output to CSV files if running locally for debugging purposes.
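In pandas terms, the per-rule cap calculation and check boil down to roughly the following sketch (simplified column names and made-up values; the real script derives everything from the GSheet rules):

import pandas as pd

caps = pd.DataFrame({'Campaign': ['A', 'B'], 'Baseline': [100.0, 40.0]})   # baseline snapshot
live = pd.DataFrame({'Campaign': ['A', 'B'], 'Adj. Rec.': [180.0, 38.0]})  # report data
floor_pct, ceiling_pct = -25, 50                  # floor is expressed as a negative percentage
caps['Floor'] = caps['Baseline'] * (1 + floor_pct / 100)
caps['Ceiling'] = caps['Baseline'] * (1 + ceiling_pct / 100)
merged = live.merge(caps, on='Campaign', how='left')
outside = (merged['Adj. Rec.'] < merged['Floor']) | (merged['Adj. Rec.'] > merged['Ceiling'])
merged.loc[outside, 'Campaign Override'] = 'On'   # only campaign A: 180 exceeds the 150 ceiling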
Vitals
- Script ID: 1323
- Client ID / Customer ID: 240065 / 191056
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Campaign Override, AUTO - Campaign Override Date
- Linked Datasource: M1 Report
- Reference Datasource: Google Sheets
- Owner: Michael Huang (mhuang@marinsoftware.com)
- Created by Michael Huang on 2024-08-14 15:27
- Last Updated by Michael Huang on 2024-11-14 02:33
Python Code
##
## name: Ascend Auto Cap
## description:
## Enable Campaign Override flag when Ascend Adj Recommendations exceed caps set in GSheets.
##
## author: Michael S. Huang
## created: 2024-08-14
##
########### START - Local Mode Config ###########
# Step 1: Uncomment download_preview_input flag and run Preview successfully with the Datasources you want
download_preview_input=False
# Step 2: In MarinOne, go to Scripts -> Preview -> Logs, download 'dataSourceDict' pickle file, and update pickle_path below
pickle_path = '/Users/mhuang/Downloads/pickle/wex_photo_ascend_auto_cap_20240815.pkl'
# Step 3: Copy this script into local IDE with Python virtual env loaded with pandas and numpy.
# Step 4: Run locally with below code to init dataSourceDict
# determine if code is running on server or locally
def is_executing_on_server():
    try:
        # attempt to access dataSourceDict, which only exists on the server
        dict_items = dataSourceDict.items()
        return True
    except NameError:
        # NameError: dataSourceDict is missing (indicating not on server)
        return False

local_dev = False

if is_executing_on_server():
    print("Code is executing on server. Skip init.")
elif len(pickle_path) > 3:
    print("Code is NOT executing on server. Doing init.")
    local_dev = True
    # load dataSourceDict via pickled file
    import pickle
    dataSourceDict = pickle.load(open(pickle_path, 'rb'))
    # print shape and first 5 rows for each entry in dataSourceDict
    for key, value in dataSourceDict.items():
        print(f"Shape of dataSourceDict[{key}]: {value.shape}")
        # print(f"First 5 rows of dataSourceDict[{key}]:\n{value.head(5)}")
    # set outputDf same as inputDf
    inputDf = dataSourceDict["1"]
    outputDf = inputDf.copy()
    # setup timezone
    import datetime
    # Chicago is GMT-5 during daylight saving time (GMT-6 otherwise). Adjust as needed.
    CLIENT_TIMEZONE = datetime.timezone(datetime.timedelta(hours=-5))
    # import pandas and numpy
    import pandas as pd
    import numpy as np
    # Printing out the version of Python, Pandas and Numpy
    # import sys
    # python_version = sys.version
    # pandas_version = pd.__version__
    # numpy_version = np.__version__
    # print(f"python version: {python_version}")
    # print(f"pandas version: {pandas_version}")
    # print(f"numpy version: {numpy_version}")
    # other imports
    import re
    import urllib
    # import Marin util functions
    from marin_scripts_utils import tableize, select_changed
    # pandas settings
    pd.set_option('display.max_columns', None)  # Display all columns
    pd.set_option('display.max_colwidth', None)  # Display full content of each column
else:
    print("Running locally but no pickle path defined. dataSourceDict not loaded.")
    exit(1)
########### END - Local Mode Setup ###########
today = datetime.datetime.now(CLIENT_TIMEZONE).date()
# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_CAMPAIGN_ID = 'Campaign ID'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_BUDGET_OVERRIDE = 'Campaign Override'
RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE = 'AUTO - Campaign Override Date'
RPT_COL_STRATEGY = 'Strategy'
RPT_COL_STRATEGY_GOAL = 'Strategy Goal'
RPT_COL_STRATEGY_CONSTRAINT = 'Strategy Constraint'
RPT_COL_STRATEGY_TARGET = 'Strategy Target'
RPT_COL_PUBLISHER_BID_STRATEGY = 'Publisher Bid Strategy'
RPT_COL_PUBLISHER_TARGET_CPA = 'Publisher Target CPA'
RPT_COL_PUBLISHER_TARGET_ROAS = 'Publisher Target ROAS'
RPT_COL_REC_CAMPAIGN_TARGET = 'Rec. Campaign Target'
RPT_COL_ADJ_REC_CAMPAIGN_TARGET = 'Adj. Rec. Campaign Target'
RPT_COL_REC_CAMPAIGN_TARGET_DATE = 'Rec. Campaign Target Date'
RPT_COL_PUB_COST = 'Pub. Cost £'
RPT_COL_DAILY_BUDGET = 'Daily Budget'
RPT_COL_REC_CAMPAIGN_BUDGET = 'Rec. Campaign Budget'
RPT_COL_ADJ_REC_CAMPAIGN_BUDGET = 'Adj. Rec. Campaign Budget'
RPT_COL_REC_CAMPAIGN_BUDGET_DATE = 'Rec. Campaign Budget Date'
RPT_COL_RECOMMENDATION_ADJ = 'Recommendation Adj.'
# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_CAMPAIGN_OVERRIDE = 'Campaign Override'
BULK_COL_AUTO__CAMPAIGN_OVERRIDE_DATE = 'AUTO - Campaign Override Date'
outputDf[BULK_COL_CAMPAIGN_OVERRIDE] = "<<YOUR VALUE>>"
outputDf[BULK_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = "<<YOUR VALUE>>"
# internal constants
COL_BASELINE_COLUMN_KEY = 'baseline_col_key'
COL_BASELINE_COLUMN_NAME = 'baseline_col_name'
COL_FLOOR_CAP_PCT = 'floor_cap_pct'
COL_CEILING_CAP_PCT = 'ceiling_cap_pct'
COL_MONITOR_COLUMN_KEY = 'monitor_col_key'
COL_MONITOR_COLUMN_NAME = 'monitor_col_name'
VAL_OVERRIDE_ON = 'On'
VAL_OVERRIDE_OFF = 'Off'
VAL_BLANK = ''
#### user code starts here
print("inputDf shape", inputDf.shape)
print("inputDf dtypes\n", inputDf.dtypes)
print("inputDf sample\n", inputDf.head(5))
originalDf = dataSourceDict["1"]
# Make inputDf a copy of original to keep dataSourceDict pristine
inputDf = originalDf.copy()
## force expected types
# Convert RPT_COL_PUBLISHER_TARGET_ROAS and RPT_COL_PUBLISHER_TARGET_CPA to numeric, coercing errors to NaN
inputDf[RPT_COL_PUBLISHER_TARGET_ROAS] = pd.to_numeric(inputDf[RPT_COL_PUBLISHER_TARGET_ROAS], errors='coerce')
inputDf[RPT_COL_PUBLISHER_TARGET_CPA] = pd.to_numeric(inputDf[RPT_COL_PUBLISHER_TARGET_CPA], errors='coerce')
# Force RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE to be Date type
inputDf[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = pd.to_datetime(inputDf[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE], errors='coerce').dt.date
## prepare caps from GSheets
# sheet 1 contains list of cap rules, with actual cap values on sheet 2
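# Expected sheet 1 layout (one row per cap rule), inferred from the rename below:
#   A: baseline column key (column letter on sheet 2 holding the baseline value)
#   B: baseline column name (matching report column, e.g. 'Daily Budget')
#   C: floor cap pct, expressed as a negative percentage (e.g. -25)
#   D: ceiling cap pct (e.g. 50)
#   E/F: key and name of the column to monitor against the caps (e.g. the Adj. Rec. column)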
df_cap_rules = dataSourceDict['2_1'].rename(columns={
'A': COL_BASELINE_COLUMN_KEY,
'B': COL_BASELINE_COLUMN_NAME,
'C': COL_FLOOR_CAP_PCT,
'D': COL_CEILING_CAP_PCT,
'E': COL_MONITOR_COLUMN_KEY,
'F': COL_MONITOR_COLUMN_NAME
})
### DEBUGGING, set ceiling to 2%
# df_cap_rules[COL_CEILING_CAP_PCT] = 2
print("cap rules\n", tableize(df_cap_rules))
# Create a dictionary to bring in baseline columns and rename to match report column name
column_mapping = dict(zip(df_cap_rules[COL_BASELINE_COLUMN_KEY], df_cap_rules[COL_BASELINE_COLUMN_NAME]))
column_mapping['A'] = RPT_COL_CAMPAIGN
column_mapping['C'] = RPT_COL_ACCOUNT
# sheet 2 contains snapshot of campaign settings to be used as baseline
df_campaign_caps = dataSourceDict['2_2'][column_mapping.keys()].rename(columns=column_mapping)
# calculate actual campaign-level caps from each rule and save all intermediate columns
df_merged = pd.DataFrame()
for index, rule in df_cap_rules.iterrows():
    baseline_col = rule[COL_BASELINE_COLUMN_NAME]
    monitor_col = rule[COL_MONITOR_COLUMN_NAME]
    floor_cap_pct = rule[COL_FLOOR_CAP_PCT]
    ceiling_cap_pct = rule[COL_CEILING_CAP_PCT]
    # define col names
    COL_BASELINE = f"{baseline_col}_baseline"
    COL_FLOOR = f"{baseline_col}_floor"
    COL_CEILING = f"{baseline_col}_ceiling"
    COL_OVERRIDE = f"{baseline_col}_override"
    print(f"applying rule for baseline '{baseline_col}'")
    # calculate lower and upper caps, but only if original targets are defined and non-zero
    # (note: the comparison must be parenthesized, since & binds tighter than >)
    baseline_defined = (df_campaign_caps[baseline_col].notna() & (df_campaign_caps[baseline_col] > 0))
    print(f"baseline value defined for {sum(baseline_defined)} rows")
    # floor should be expressed as NEGATIVE percentage (eg -25%)
    floor = df_campaign_caps[baseline_col] * (1 + floor_cap_pct / 100)
    ceiling = df_campaign_caps[baseline_col] * (1 + ceiling_cap_pct / 100)
    # remove caps if baseline is ill defined
    floor = floor.where(baseline_defined, np.nan)
    ceiling = ceiling.where(baseline_defined, np.nan)
    # copy baseline to ease debugging
    df_campaign_caps[COL_BASELINE] = df_campaign_caps[baseline_col]
    df_campaign_caps[COL_FLOOR] = floor
    df_campaign_caps[COL_CEILING] = ceiling
    # print(f"df_campaign_caps\n", df_campaign_caps.loc[baseline_defined].head(5))
    # merge the campaign-level floor and ceiling caps back into inputDf
    df_comparison = inputDf.merge(df_campaign_caps[[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, COL_BASELINE, COL_FLOOR, COL_CEILING]],
                                  how='left',
                                  on=[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN])
    print("df_comparison shape", df_comparison.shape)
    print("df_comparison dtypes\n", df_comparison.dtypes)
    # sanity check
    # comparables = df_comparison.loc[df_comparison[COL_CEILING].notna() & df_comparison[monitor_col].notna()]
    # print("df_comparison comparables sample\n", comparables.head(5))
    # check monitoring column (usually the Adjusted Recommendation column; i.e. what would be pushed to the publisher)
    exceeded_floor = (df_comparison[COL_FLOOR].notna() & (df_comparison[monitor_col] < df_comparison[COL_FLOOR]))
    exceeded_ceiling = (df_comparison[COL_CEILING].notna() & (df_comparison[monitor_col] > df_comparison[COL_CEILING]))
    should_override = exceeded_floor | exceeded_ceiling
    # set Override flag for this baseline
    df_comparison.loc[should_override, COL_OVERRIDE] = VAL_OVERRIDE_ON
    print(f">> should enable Override for {sum(should_override)} campaigns")
    # Select specific columns to concatenate
    # Note: include RPT_COL_BUDGET_OVERRIDE to preserve original values prior to changing them,
    # and carry RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE so the disable logic below can detect
    # overrides previously set by this script
    columns_to_concatenate = [RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, RPT_COL_BUDGET_OVERRIDE, RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE, COL_BASELINE, COL_FLOOR, COL_CEILING, COL_OVERRIDE]
    df_comparison_selected = df_comparison[columns_to_concatenate]
    # Combine for output and debugging
    if df_merged.empty:
        df_merged = df_comparison_selected.copy()
    else:
        df_merged = df_merged.merge(df_comparison_selected,
                                    on=[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, RPT_COL_BUDGET_OVERRIDE, RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE],
                                    how='left',
                                    suffixes=('', '_new'))
# collect override flags for each rule
override_cols = [f"{col}_override" for col in df_cap_rules[COL_BASELINE_COLUMN_NAME]]
# Build to_override array based on whether a row has any override_cols set to VAL_OVERRIDE_ON
to_override = df_merged[override_cols].apply(lambda row: any(row == VAL_OVERRIDE_ON), axis=1)
print(f">> After applying all rules, will ENABLE Override for {sum(to_override)} campaigns")
# actually enable Override
df_merged.loc[to_override, RPT_COL_BUDGET_OVERRIDE] = VAL_OVERRIDE_ON
df_merged.loc[to_override, RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = today.strftime('%Y-%m-%d')
# disable Override for those not in to_override, if Override Date is set
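# note: a populated date renders as 'YYYY-MM-DD' (10 chars), while NaN/blank stringify shorter,
# so the length check below distinguishes script-set dates from empty values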
overriden_by_scripts = (df_merged[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE].astype('str').str.len() >= 10)
to_disable_override = ~to_override & overriden_by_scripts
print(f">> Will also DISABLE Override for {sum(to_disable_override)} campaigns")
# actually disable Override
df_merged.loc[to_disable_override, RPT_COL_BUDGET_OVERRIDE] = VAL_OVERRIDE_OFF
df_merged.loc[to_disable_override, RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = VAL_BLANK
# cleanup RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE
# RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE is a marker indicating that this script actioned the Override.
# If Override is not On, for whatever reason, a lingering non-blank date would be misleading, so clear it.
has_override_date = (df_merged[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE].astype('str').str.len() >= 10)
override_date_orphans = (df_merged[RPT_COL_BUDGET_OVERRIDE] != VAL_OVERRIDE_ON) & has_override_date
df_merged.loc[override_date_orphans, RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = VAL_BLANK
print(f"Cleaned up {override_date_orphans.sum()} orphaned Override Dates")
print("df_merged shape", df_merged.shape)
print("df_merged dtypes\n", df_merged.dtypes)
debugDf = df_merged
## Prepare Output
# fill na with blank string since can't compare na
df_merged[RPT_COL_BUDGET_OVERRIDE] = df_merged[RPT_COL_BUDGET_OVERRIDE].fillna('')
originalDf[RPT_COL_BUDGET_OVERRIDE] = originalDf[RPT_COL_BUDGET_OVERRIDE].fillna('')
df_merged[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = df_merged[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE].fillna('')
originalDf[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = originalDf[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE].fillna('')
# only include changed rows in bulk file
print(f"select_changed with df_merged shape {df_merged.shape} and originalDf shape {originalDf.shape}")
(outputDf, debugDf) = select_changed(
    df_merged,
    originalDf,
    diff_cols=[
        RPT_COL_BUDGET_OVERRIDE,
    ],
    select_cols=[
        RPT_COL_ACCOUNT,
        RPT_COL_CAMPAIGN,
        RPT_COL_BUDGET_OVERRIDE,
        RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE,
    ],
    merged_cols=[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN]
)
# remember to use Bulk column header for Override
outputDf = outputDf.rename(columns={
    RPT_COL_BUDGET_OVERRIDE: BULK_COL_CAMPAIGN_OVERRIDE
})
print("outputDf shape", outputDf.shape)
print("outputDf sample", tableize(outputDf.tail(5)))
## local debug
if local_dev:
    output_filename = 'outputDf.csv'
    outputDf.to_csv(output_filename, index=False)
    print(f"Local Dev: Output written to: {output_filename}")
    debug_filename = 'debugDf.csv'
    debugDf.to_csv(debug_filename, index=False)
    print(f"Local Dev: Debug written to: {debug_filename}")