Script 1787: Auto Pause Webinar Ads

Purpose:

The Python script automates the pausing and resuming of webinar ads based on their month-to-date (MTD) spending relative to a predefined monthly budget.

To Elaborate

The script is designed to manage the budget allocation for webinar ads by automatically pausing ads when their month-to-date (MTD) spending exceeds a specified monthly budget, and resuming them when the spending falls below the budget. This is achieved by comparing the MTD spend of each ad against a monthly budget threshold, adjusted by a safety margin to account for system lag and non-linear intraday spending. The script ensures that ads are paused or resumed based on their spending status, helping to maintain budget discipline and optimize ad performance.

Walking Through the Code

Configuration and Setup
- The script begins by defining configurable parameters, including a safety margin (BUDGET_CAP_SAFETY_MARGIN) and the monthly budget for webinar ads (WEBINAR_AD_MONTHLY_BUDGET).
- It checks whether the script is running on a server or locally, and loads necessary data from a pickle file if running locally.
Data Preparation
- The script initializes the primary data source (inputDf) and makes a copy to preserve the original data.
- It forces specific data types for certain columns and replaces any NaN values with empty strings to prevent comparison errors.
Ad Management Logic
- The script identifies ads that have exceeded the budget threshold and marks them for pausing by setting a ‘pause’ flag.
- It updates the status of these ads to ‘Paused’ and records the date of this action.
- Similarly, it identifies ads that can be resumed (those under budget and previously paused) and updates their status to ‘Active’.
Output Preparation
- The script cleans up any orphaned pause dates to avoid confusion.
- It uses a utility function to select only the changed rows for output, ensuring that only necessary updates are made.
- If running locally, it writes the output and debug data to CSV files for further analysis.

Vitals

Script ID : 1787
Client ID / Customer ID: 1306922277 / 60268979
Action Type: Bulk Upload (Preview)
Item Changed: Ad
Output Columns: Account, Campaign, Group, Creative ID, Status, Webinar Creative Auto Pause Date
Linked Datasource: M1 Report
Reference Datasource: None
Owner: ascott@marinsoftware.com (ascott@marinsoftware.com)
Created by ascott@marinsoftware.com on 2025-03-07 20:14
Last Updated by Michael Huang on 2025-03-10 09:07

> See it in Action

Python Code

### name: Webinar Ads Budget Cap
## description:
##  Pause Ads when MTD spend reaches Monthly Budget
##  Resume previously paused Ads when MTD spend falls below Monthly Budget
## 
## author: Michael S. Huang
## created: 2025-03-10
## 

##### Configurable Param #####
# Define how close MTD spend can get to Monthly Budget before being Paused
#  - compensates for lag in system
#  - compendates for non-linearity in intraday spend
BUDGET_CAP_SAFETY_MARGIN = 0.01 # set to 1%
WEBINAR_AD_MONTHLY_BUDGET = 100 # $100 USD per ad per month
##############################


########### START - Local Mode Config ###########
# Step 1: Uncomment download_preview_input flag and run Preview successfully with the Datasources you want
download_preview_input=False
# Step 2: In MarinOne, go to Scripts -> Preview -> Logs, download 'dataSourceDict' pickle file, and update pickle_path below
pickle_path = '/Users/mhuang/Downloads/pickle/alumni_venture_webinar_ads_autopause_20250310.pkl'
# Step 3: Copy this script into local IDE with Python virtual env loaded with pandas and numpy.
# Step 4: Run locally with below code to init dataSourceDict

# determine if code is running on server or locally
def is_executing_on_server():
    try:
        # Attempt to access a known restricted builtin
        dict_items = dataSourceDict.items()
        return True
    except NameError:
        # NameError: dataSourceDict object is missing (indicating not on server)
        return False

local_dev = False

if is_executing_on_server():
    print("Code is executing on server. Skip init.")
elif len(pickle_path) > 3:
    print("Code is NOT executing on server. Doing init.")
    local_dev = True
    # load dataSourceDict via pickled file
    import pickle
    dataSourceDict = pickle.load(open(pickle_path, 'rb'))

    # print shape and first 5 rows for each entry in dataSourceDict
    for key, value in dataSourceDict.items():
        print(f"Shape of dataSourceDict[{key}]: {value.shape}")
        # print(f"First 5 rows of dataSourceDict[{key}]:\n{value.head(5)}")

    # set outputDf same as inputDf
    inputDf = dataSourceDict["1"]
    outputDf = inputDf.copy()

    # setup timezone
    import datetime
    # Chicago Timezone is GMT-5. Adjust as needed.
    CLIENT_TIMEZONE = datetime.timezone(datetime.timedelta(hours=-5))

    # import pandas
    import pandas as pd
    import numpy as np

    # Printing out the version of Python, Pandas and Numpy
    # import sys
    # python_version = sys.version
    # pandas_version = pd.__version__
    # numpy_version = np.__version__

    # print(f"python version: {python_version}")
    # print(f"pandas version: {pandas_version}")
    # print(f"numpy version: {numpy_version}")

    # other imports
    import re
    import urllib

    # import Marin util functions
    from marin_scripts_utils import tableize, select_changed
else:
    print("Running locally but no pickle path defined. dataSourceDict not loaded.")
    exit(1)
########### END - Local Mode Setup ###########

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_PUB_ID = 'Pub. ID'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP = 'Group'
RPT_COL_STATUS = 'Status'
RPT_COL_CREATIVE_TYPE = 'Creative Type'
RPT_COL_IMPR = 'Impr.'
RPT_COL_CLICKS = 'Clicks'
RPT_COL_CTR = 'CTR %'
RPT_COL_PUBLISHER = 'Publisher'
RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE = 'Webinar Creative Auto Pause Date'
RPT_COL_CREATIVE_ID = 'Creative ID'

# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_GROUP = 'Group'
BULK_COL_CREATIVE_ID = 'Creative ID'
BULK_COL_STATUS = 'Status'
BULK_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE = 'Webinar Creative Auto Pause Date'
outputDf[BULK_COL_STATUS] = "<<YOUR VALUE>>"
outputDf[BULK_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE] = "<<YOUR VALUE>>"

### User Code Starts Here

originalDf = dataSourceDict["1"]

# Make inputDf a copy of original to keep dataSourceDict pristine
inputDf = originalDf.copy()

# define Status values
VAL_STATUS_ACTIVE = 'Active'
VAL_STATUS_PAUSED = 'Paused'
VAL_BLANK = ''

print("inputDf shape", inputDf.shape)
print("inputDf info", inputDf.info())

## force expected types
# Force RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE to be Date type
inputDf[RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE] = pd.to_datetime(inputDf[RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE], errors='coerce').dt.date


# HACK: replace nan with empty strings so comparison doesn't fail
inputDf.fillna(VAL_BLANK, inplace=True)

# Recommend to Pause Ads with MTD Spend over Monthly Budget (by a margin)
over_spent_ads = inputDf[RPT_COL_PUB_COST] >= WEBINAR_AD_MONTHLY_BUDGET * (1 - BUDGET_CAP_SAFETY_MARGIN)
ads_to_pause = over_spent_ads

inputDf.loc[ads_to_pause, 'pause'] = 1
print(f"ads_to_pause count: {sum(ads_to_pause)}")
if ads_to_pause.any():
    print("ads_to_pause ads", tableize(inputDf.loc[ads_to_pause].head()))

inputDf.loc[ ads_to_pause, \
             [RPT_COL_STATUS, RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE] \
           ] = [VAL_STATUS_PAUSED, today.strftime('%Y-%m-%d')]

# Recommend to reactivate Ads with MTD pend under Monthly Budget (by a margin)
# but limited to Ads with Pause Date populated with 10 digit dates
under_spent_ads = inputDf[RPT_COL_PUB_COST] < WEBINAR_AD_MONTHLY_BUDGET * (1 - BUDGET_CAP_SAFETY_MARGIN)
autopaused_ads = inputDf[RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE].astype('str').str.len() >= 10
ads_to_resume = under_spent_ads & autopaused_ads

inputDf.loc[ads_to_resume, 'resume'] = 1
print(f"ads_to_resume count: {sum(ads_to_resume)}")
if ads_to_resume.any():
    print("ads_to_resume", tableize(inputDf.loc[ads_to_resume].head()))

inputDf.loc[ ads_to_resume, \
             [RPT_COL_STATUS, RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE] \
           ] = [VAL_STATUS_ACTIVE, VAL_BLANK]

## Prepare Output

# Cleanup. RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE is a marker to indicate this Script actioned the Pause. If not Paused, for whatever reason, then a non-blank RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE causes confusion. 
orphan_pause_date = autopaused_ads & (inputDf[RPT_COL_STATUS] == VAL_STATUS_ACTIVE)
inputDf.loc[orphan_pause_date, RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE] = VAL_BLANK
print(f"Cleaned up {orphan_pause_date.sum()} orphaned {RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE}")

# only include changed rows in bulk file
print(f"select_changed with inputDf shape {inputDf.shape} and originalDf shape {originalDf.shape}")
(outputDf, debugDf) = select_changed(inputDf, \
                                    originalDf, \
                                    diff_cols = [ \
                                        RPT_COL_STATUS, \
                                    ], \
                                    select_cols = [ \
                                        RPT_COL_ACCOUNT, \
                                        RPT_COL_CAMPAIGN, \
                                        RPT_COL_GROUP, \
                                        RPT_COL_CREATIVE_ID, \
                                        RPT_COL_STATUS, \
                                        RPT_COL_WEBINAR_CREATIVE_AUTO_PAUSE_DATE, \
                                    ], \
                                    merged_cols=[RPT_COL_PUBLISHER, RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, RPT_COL_GROUP, RPT_COL_CREATIVE_ID] \
                                    )


print("outputDf shape", outputDf.shape)
print("outputDf", tableize(outputDf.tail(5)))

## local debug
if local_dev:
    output_filename = 'outputDf.csv'
    outputDf.to_csv(output_filename, index=False)
    print(f"Local Dev: Output written to: {output_filename}")

    debug_filename = 'debugDf.csv'
    debugDf.to_csv(debug_filename, index=False)
    print(f"Local Dev: Debug written to: {debug_filename}")
   

Post generated on 2025-03-11 01:25:51 GMT

07 Mar 2025

« Script 1785: Webinar Creative AutoPause

MarinOne Scripts Creator's Corner