Script 1323: Ascend Auto Cap

Purpose

This Python script automatically enables a campaign override flag when Ascend adjusted recommendations exceed predefined caps set in Google Sheets.

To Elaborate

The script manages campaign budget overrides by comparing each campaign's adjusted recommendations against floor and ceiling caps defined in Google Sheets. When an adjusted recommendation falls outside the acceptable range (below the floor or above the ceiling), the script enables the campaign's override flag; when it falls back within range, the flag is disabled again. This keeps campaigns within budget constraints while still allowing adjustments when needed. The script can run either locally or on a server, with a specific configuration for each environment.
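The core floor/ceiling check can be sketched as follows. The values and cap percentages here are hypothetical (the real caps come from the Google Sheet); note that the floor cap is expressed as a negative percentage, and rows with no usable baseline get no caps at all:

```python
import pandas as pd
import numpy as np

# Hypothetical baseline values and cap percentages (floor is a NEGATIVE percent)
baseline = pd.Series([100.0, 50.0, np.nan])
floor_cap_pct, ceiling_cap_pct = -25, 25  # allow -25% .. +25% around baseline
adjusted_rec = pd.Series([120.0, 70.0, 10.0])

floor = baseline * (1 + floor_cap_pct / 100)      # 75.0, 37.5, NaN
ceiling = baseline * (1 + ceiling_cap_pct / 100)  # 125.0, 62.5, NaN

# Override only where a cap exists and the adjusted recommendation falls outside it
should_override = (floor.notna() & (adjusted_rec < floor)) | \
                  (ceiling.notna() & (adjusted_rec > ceiling))
print(should_override.tolist())  # [False, True, False]
```

The second row trips the ceiling (70 > 62.5), while the third row has no baseline and so is never overridden.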

Walking Through the Code

  1. Initialization and Configuration:
    • The script begins by setting up configurations for local or server execution. It checks if the code is running on a server or locally and loads necessary data from a pickle file if running locally.
    • User-changeable parameters include the pickle_path for local data loading and the download_preview_input flag for previewing data sources.
  2. Data Preparation:
    • The script loads data from a dictionary (dataSourceDict) and prepares it for processing. It ensures that certain columns are converted to the correct data types, such as numeric or date types.
    • It retrieves cap rules and campaign settings from Google Sheets, renaming columns to match report column names.
  3. Cap Calculation and Comparison:
    • For each cap rule, the script calculates the floor and ceiling caps based on baseline values. It checks if the adjusted recommendations exceed these caps.
    • It merges the calculated caps back into the main data frame and determines whether the override flag should be enabled or disabled based on the comparison results.
  4. Override Flag Management:
    • The script collects override flags for each rule and determines which campaigns should have the override enabled or disabled.
    • It updates the override flag and date in the data frame based on the comparison results and cleans up any orphaned override dates.
  5. Output Preparation:
    • The script prepares the output data frame, ensuring only changed rows are included. It renames columns to match the expected output format and writes the results to CSV files if running in local development mode.
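The cap-merge and flag-management steps above can be sketched like this. The column names follow the report's conventions, but the frames and values are illustrative:

```python
import pandas as pd

# Illustrative campaign report and per-campaign caps (hypothetical values)
campaigns = pd.DataFrame({
    'Account': ['A', 'A'],
    'Campaign': ['C1', 'C2'],
    'Adj. Rec. Campaign Budget': [130.0, 90.0],
})
caps = pd.DataFrame({
    'Account': ['A', 'A'],
    'Campaign': ['C1', 'C2'],
    'floor': [75.0, 75.0],
    'ceiling': [125.0, 125.0],
})

# Merge caps into the main frame, then flag rows outside the allowed range
merged = campaigns.merge(caps, how='left', on=['Account', 'Campaign'])
exceeded = (merged['Adj. Rec. Campaign Budget'] < merged['floor']) | \
           (merged['Adj. Rec. Campaign Budget'] > merged['ceiling'])
merged.loc[exceeded, 'Campaign Override'] = 'On'
merged.loc[~exceeded, 'Campaign Override'] = 'Off'
print(merged['Campaign Override'].tolist())  # ['On', 'Off']
```

C1 exceeds its ceiling (130 > 125) and gets the override enabled; C2 stays within range.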

Vitals

  • Script ID : 1323
  • Client ID / Customer ID: 240065 / 191056
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Campaign Override, AUTO - Campaign Override Date
  • Linked Datasource: M1 Report
  • Reference Datasource: Google Sheets
  • Owner: Michael Huang (mhuang@marinsoftware.com)
  • Created by Michael Huang on 2024-08-14 15:27
  • Last Updated by Michael Huang on 2024-11-14 02:33
> See it in Action

Python Code

##
## name: Ascend Auto Cap
## description:
##  Enable Campaign Override flag when Ascend Adj Recommendations exceed caps set in GSheets.
## 
## author: Michael S. Huang
## created: 2024-08-14
## 

########### START - Local Mode Config ###########
# Step 1: Uncomment download_preview_input flag and run Preview successfully with the Datasources you want
download_preview_input=False
# Step 2: In MarinOne, go to Scripts -> Preview -> Logs, download 'dataSourceDict' pickle file, and update pickle_path below
pickle_path = '/Users/mhuang/Downloads/pickle/wex_photo_ascend_auto_cap_20240815.pkl'
# Step 3: Copy this script into local IDE with Python virtual env loaded with pandas and numpy.
# Step 4: Run locally with below code to init dataSourceDict

# determine if code is running on server or locally
def is_executing_on_server():
    try:
        # Attempt to access the server-provided dataSourceDict global
        dict_items = dataSourceDict.items()
        return True
    except NameError:
        # NameError: dataSourceDict object is missing (indicating not on server)
        return False

local_dev = False

if is_executing_on_server():
    print("Code is executing on server. Skip init.")
elif len(pickle_path) > 3:
    print("Code is NOT executing on server. Doing init.")
    local_dev = True
    # load dataSourceDict via pickled file
    import pickle
    dataSourceDict = pickle.load(open(pickle_path, 'rb'))

    # print shape and first 5 rows for each entry in dataSourceDict
    for key, value in dataSourceDict.items():
        print(f"Shape of dataSourceDict[{key}]: {value.shape}")
        # print(f"First 5 rows of dataSourceDict[{key}]:\n{value.head(5)}")

    # set outputDf same as inputDf
    inputDf = dataSourceDict["1"]
    outputDf = inputDf.copy()

    # setup timezone
    import datetime
    # Chicago is GMT-6 (CST) / GMT-5 (CDT). Fixed offset used here; adjust as needed.
    CLIENT_TIMEZONE = datetime.timezone(datetime.timedelta(hours=-5))

    # import pandas
    import pandas as pd
    import numpy as np

    # Printing out the version of Python, Pandas and Numpy
    # import sys
    # python_version = sys.version
    # pandas_version = pd.__version__
    # numpy_version = np.__version__

    # print(f"python version: {python_version}")
    # print(f"pandas version: {pandas_version}")
    # print(f"numpy version: {numpy_version}")

    # other imports
    import re
    import urllib

    # import Marin util functions
    from marin_scripts_utils import tableize, select_changed

    # pandas settings
    pd.set_option('display.max_columns', None)  # Display all columns
    pd.set_option('display.max_colwidth', None)  # Display full content of each column

else:
    print("Running locally but no pickle path defined. dataSourceDict not loaded.")
    exit(1)
########### END - Local Mode Config ###########

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_CAMPAIGN_ID = 'Campaign ID'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_BUDGET_OVERRIDE = 'Campaign Override'
RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE = 'AUTO - Campaign Override Date'
RPT_COL_STRATEGY = 'Strategy'
RPT_COL_STRATEGY_GOAL = 'Strategy Goal'
RPT_COL_STRATEGY_CONSTRAINT = 'Strategy Constraint'
RPT_COL_STRATEGY_TARGET = 'Strategy Target'
RPT_COL_PUBLISHER_BID_STRATEGY = 'Publisher Bid Strategy'
RPT_COL_PUBLISHER_TARGET_CPA = 'Publisher Target CPA'
RPT_COL_PUBLISHER_TARGET_ROAS = 'Publisher Target ROAS'
RPT_COL_REC_CAMPAIGN_TARGET = 'Rec. Campaign Target'
RPT_COL_ADJ_REC_CAMPAIGN_TARGET = 'Adj. Rec. Campaign Target'
RPT_COL_REC_CAMPAIGN_TARGET_DATE = 'Rec. Campaign Target Date'
RPT_COL_PUB_COST = 'Pub. Cost £'
RPT_COL_DAILY_BUDGET = 'Daily Budget'
RPT_COL_REC_CAMPAIGN_BUDGET = 'Rec. Campaign Budget'
RPT_COL_ADJ_REC_CAMPAIGN_BUDGET = 'Adj. Rec. Campaign Budget'
RPT_COL_REC_CAMPAIGN_BUDGET_DATE = 'Rec. Campaign Budget Date'
RPT_COL_RECOMMENDATION_ADJ = 'Recommendation Adj.'

# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_CAMPAIGN_OVERRIDE = 'Campaign Override'
BULK_COL_AUTO__CAMPAIGN_OVERRIDE_DATE = 'AUTO - Campaign Override Date'
outputDf[BULK_COL_CAMPAIGN_OVERRIDE] = "<<YOUR VALUE>>"
outputDf[BULK_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = "<<YOUR VALUE>>"

# internal constants
COL_BASELINE_COLUMN_KEY = 'baseline_col_key'
COL_BASELINE_COLUMN_NAME = 'baseline_col_name'
COL_FLOOR_CAP_PCT = 'floor_cap_pct'
COL_CEILING_CAP_PCT = 'ceiling_cap_pct'
COL_MONITOR_COLUMN_KEY = 'monitor_col_key'
COL_MONITOR_COLUMN_NAME = 'monitor_col_name'

VAL_OVERRIDE_ON = 'On'
VAL_OVERRIDE_OFF = 'Off'
VAL_BLANK = ''

#### user code starts here

print("inputDf shape", inputDf.shape)
print("inputDf dtypes\n", inputDf.dtypes)
print("inputDf sample\n", inputDf.head(5))

originalDf = dataSourceDict["1"]

# Make inputDf a copy of original to keep dataSourceDict pristine
inputDf = originalDf.copy()

## force expected types
# Convert RPT_COL_PUBLISHER_TARGET_ROAS and RPT_COL_PUBLISHER_TARGET_CPA to numeric, coercing errors to NaN
inputDf[RPT_COL_PUBLISHER_TARGET_ROAS] = pd.to_numeric(inputDf[RPT_COL_PUBLISHER_TARGET_ROAS], errors='coerce')
inputDf[RPT_COL_PUBLISHER_TARGET_CPA] = pd.to_numeric(inputDf[RPT_COL_PUBLISHER_TARGET_CPA], errors='coerce')
# Force RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE to be Date type
inputDf[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = pd.to_datetime(inputDf[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE], errors='coerce').dt.date

## prepare caps from GSheets

# sheet 1 contains list of cap rules, with actual cap values on sheet 2
df_cap_rules = dataSourceDict['2_1'].rename(columns={
    'A': COL_BASELINE_COLUMN_KEY,
    'B': COL_BASELINE_COLUMN_NAME,
    'C': COL_FLOOR_CAP_PCT,
    'D': COL_CEILING_CAP_PCT,
    'E': COL_MONITOR_COLUMN_KEY,
    'F': COL_MONITOR_COLUMN_NAME
})

### DEBUGGING, set ceiling to 2%
# df_cap_rules[COL_CEILING_CAP_PCT] = 2

print("cap rules\n", tableize(df_cap_rules))

# Create a dictionary to bring in baseline columns and rename to match report column name
column_mapping = dict(zip(df_cap_rules[COL_BASELINE_COLUMN_KEY], df_cap_rules[COL_BASELINE_COLUMN_NAME]))
column_mapping['A'] = RPT_COL_CAMPAIGN
column_mapping['C'] = RPT_COL_ACCOUNT

# sheet 2 contains snapshot of campaign settings to be used as baseline
df_campaign_caps = dataSourceDict['2_2'][list(column_mapping.keys())].rename(columns=column_mapping)


# calculate actual campaign-level caps from each rule and save all intermediate columns
df_merged = pd.DataFrame()

for index, rule in df_cap_rules.iterrows():
    baseline_col = rule[COL_BASELINE_COLUMN_NAME]
    monitor_col = rule[COL_MONITOR_COLUMN_NAME]
    floor_cap_pct = rule[COL_FLOOR_CAP_PCT]
    ceiling_cap_pct = rule[COL_CEILING_CAP_PCT]

    # define col names
    COL_BASELINE = f"{baseline_col}_baseline"
    COL_FLOOR = f"{baseline_col}_floor"
    COL_CEILING = f"{baseline_col}_ceiling"
    COL_OVERRIDE = f"{baseline_col}_override"

    print(f"applying rule for baseline '{baseline_col}'")

    # calculate lower and upper caps, but only if original targets are defined and non-zero
    # (note the parentheses: & binds tighter than >, so the comparison must be grouped)
    baseline_defined = (df_campaign_caps[baseline_col].notna() & (df_campaign_caps[baseline_col] > 0))
    
    print(f"baseline value defined for {sum(baseline_defined)} rows")
    
    # floor should be expressed as NEGATIVE percentage (eg -25%)
    floor = df_campaign_caps[baseline_col] * (1 + floor_cap_pct / 100)
    ceiling = df_campaign_caps[baseline_col] * (1 + ceiling_cap_pct / 100)
    # remove caps if baseline is ill defined
    floor = floor.where(baseline_defined, np.nan)
    ceiling = ceiling.where(baseline_defined, np.nan)

    # copy baseline to ease debugging
    df_campaign_caps[COL_BASELINE] = df_campaign_caps[baseline_col]
    df_campaign_caps[COL_FLOOR] = floor
    df_campaign_caps[COL_CEILING] = ceiling

    # print(f"df_campaign_caps\n", df_campaign_caps.loc[baseline_defined].head(5))

    # merge the campaign-level floor and ceiling caps back into inputDf
    df_comparison = inputDf.merge(df_campaign_caps[[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, COL_BASELINE, COL_FLOOR, COL_CEILING]], 
                            how='left', 
                            on=[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN])
    
    print("df_comparison shape", df_comparison.shape)
    print("df_comparison dtypes\n", df_comparison.dtypes)

    # sanity check
    # comparables = df_comparison.loc[df_comparison[COL_CEILING].notna() & df_comparison[monitor_col].notna()]
    # print("df_comparison comparables sample\n", comparables.head(5))

    # check monitoring column (usually the Adjusted Recommendation column; e.g. what would be pushed to publisher)
    exceeded_floor = (df_comparison[COL_FLOOR].notna() & (df_comparison[monitor_col] < df_comparison[COL_FLOOR]))
    exceeded_ceiling = (df_comparison[COL_CEILING].notna() & (df_comparison[monitor_col] > df_comparison[COL_CEILING]))
    should_override = exceeded_floor | exceeded_ceiling
    # set Override flag for this baseline
    df_comparison.loc[should_override, COL_OVERRIDE] = VAL_OVERRIDE_ON
    print(f">> should enable Override for {sum(should_override)} campaigns")

    # Select specific columns to concatenate
    # Note: need to include RPT_COL_BUDGET_OVERRIDE to preserve original value prior to changing them
    columns_to_concatenate = [RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, RPT_COL_BUDGET_OVERRIDE, COL_BASELINE, COL_FLOOR, COL_CEILING, COL_OVERRIDE]
    df_comparison_selected = df_comparison[columns_to_concatenate]

    # Combine for output and debugging
    if df_merged.empty:
        df_merged = df_comparison_selected
    else:
        df_merged = df_merged.merge(df_comparison_selected, 
                                    on=[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, RPT_COL_BUDGET_OVERRIDE], 
                                    how='left', 
                                    suffixes=('', '_new'))


# collect override flags for each rule
override_cols = [f"{col}_override" for col in df_cap_rules[COL_BASELINE_COLUMN_NAME]]
# Build to_override array based on whether a row has any override_cols set to VAL_OVERRIDE_ON
to_override = df_merged[override_cols].apply(lambda row: any(row == VAL_OVERRIDE_ON), axis=1)
print(f">> After applying all rules, will ENABLE Override for {sum(to_override)} campaigns")

# actually enable Override
df_merged.loc[to_override, RPT_COL_BUDGET_OVERRIDE] = VAL_OVERRIDE_ON
df_merged.loc[to_override, RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = today.strftime('%Y-%m-%d')

# disable Override for those not in to_override, if Override Date is set
overridden_by_script = (df_merged[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE].astype('str').str.len() >= 10)
to_disable_override = ~to_override & overridden_by_script
print(f">> Will also DISABLE Override for {sum(to_disable_override)} campaigns")

# actually disable Override
df_merged.loc[to_disable_override, RPT_COL_BUDGET_OVERRIDE] = VAL_OVERRIDE_OFF
df_merged.loc[to_disable_override, RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = VAL_BLANK

# cleanup RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE
# RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE is a marker to indicate this Script actioned the Override. 
# If not Override, for whatever reason, then a non-blank RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE causes confusion.
has_override_date = (df_merged[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE].astype('str').str.len() >= 10)
override_date_orphans = (df_merged[RPT_COL_BUDGET_OVERRIDE] != VAL_OVERRIDE_ON) & has_override_date
df_merged.loc[override_date_orphans, RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = VAL_BLANK
print(f"Cleaned up {override_date_orphans.sum()} orphaned Override Dates")

print("df_merged shape", df_merged.shape)
print("df_merged dtypes\n", df_merged.dtypes)

debugDf = df_merged

## Prepare Output

# fill NaN with blank string since NaN values can't be compared
# (assign back rather than calling fillna(inplace=True) on a column slice,
#  which may silently operate on a copy under pandas copy-on-write)
df_merged[RPT_COL_BUDGET_OVERRIDE] = df_merged[RPT_COL_BUDGET_OVERRIDE].fillna('')
originalDf[RPT_COL_BUDGET_OVERRIDE] = originalDf[RPT_COL_BUDGET_OVERRIDE].fillna('')
df_merged[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = df_merged[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE].fillna('')
originalDf[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE] = originalDf[RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE].fillna('')

# only include changed rows in bulk file
print(f"select_changed with df_merged shape {df_merged.shape} and originalDf shape {originalDf.shape}")
(outputDf, debugDf) = select_changed(df_merged, \
                                    originalDf, \
                                    diff_cols = [ \
                                        RPT_COL_BUDGET_OVERRIDE, \
                                    ], \
                                    select_cols = [ \
                                        RPT_COL_ACCOUNT, \
                                        RPT_COL_CAMPAIGN, \
                                        RPT_COL_BUDGET_OVERRIDE, \
                                        RPT_COL_AUTO__CAMPAIGN_OVERRIDE_DATE, \
                                    ], \
                                    merged_cols=[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN] \
                                    )


# remember to use Bulk column header for Override
outputDf = outputDf.rename(columns = { \
                RPT_COL_BUDGET_OVERRIDE: BULK_COL_CAMPAIGN_OVERRIDE \
                })

print("outputDf shape", outputDf.shape)
print("outputDf sample", tableize(outputDf.tail(5)))

## local debug
if local_dev:
    output_filename = 'outputDf.csv'
    outputDf.to_csv(output_filename, index=False)
    print(f"Local Dev: Output written to: {output_filename}")

    debug_filename = 'debugDf.csv'
    debugDf.to_csv(debug_filename, index=False)
    print(f"Local Dev: Debug written to: {debug_filename}")
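select_changed is a Marin-internal helper, so its exact signature isn't reproduced here; a minimal illustrative equivalent of "keep only rows whose override flag actually changed" might look like:

```python
import pandas as pd

# Hypothetical new values vs. the original report (stand-in for select_changed)
new = pd.DataFrame({
    'Account': ['A', 'A'],
    'Campaign': ['C1', 'C2'],
    'Campaign Override': ['On', 'Off'],
})
orig = pd.DataFrame({
    'Account': ['A', 'A'],
    'Campaign': ['C1', 'C2'],
    'Campaign Override': ['Off', 'Off'],
})

# Join on the merge keys, then keep rows where the diff column differs
merged = new.merge(orig, on=['Account', 'Campaign'], suffixes=('', '_orig'))
changed = merged[merged['Campaign Override'] != merged['Campaign Override_orig']]
print(changed['Campaign'].tolist())  # ['C1']
```

Only C1 changed (Off to On), so only C1 would land in the bulk upload file.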
   

Post generated on 2024-11-27 06:58:46 GMT
