Script 459: Epicor Budget Cap

Purpose:

The Python script manages advertising campaigns by pausing them when the month-to-date (MTD) spend reaches the Epicor Monthly Budget, ensuring budget compliance.

To Elaborate

The Python script titled ‘Epicor Budget Cap’ is designed to manage advertising campaigns by monitoring their spending against a predefined monthly budget. The script aims to pause campaigns when their month-to-date (MTD) expenditure approaches or exceeds the Epicor Monthly Budget, factoring in a safety margin to account for system lag and non-linear spending patterns. It also recommends reactivating campaigns that have spent below the budget threshold, provided they were previously paused. This ensures that campaigns do not overspend while allowing flexibility to resume them if they are under budget. The script operates by analyzing campaign data, calculating MTD spend, and adjusting campaign statuses accordingly, thereby maintaining structured budget allocation (SBA) compliance.

Walking Through the Code

Initialization and Configuration:
- The script begins by setting a configurable parameter, BUDGET_CAP_SAFETY_MARGIN, which determines how close the MTD spend can get to the monthly budget before pausing campaigns.
- It checks whether the code is running on a server or locally, and if local, it loads data from a pickle file to initialize the dataSourceDict.
Data Preparation:
- The script prepares the data by filling missing values and converting necessary columns to appropriate data types.
- It calculates the MTD Epicor Spend for each campaign by summing the publisher costs grouped by Epicor ID.
Campaign Status Evaluation:
- Campaigns are evaluated for pausing if their MTD spend exceeds the monthly budget, adjusted by the safety margin, and they are currently active.
- Conversely, campaigns are evaluated for resuming if their MTD spend is below the budget threshold and they have a populated pause date.
Status Update and Cleanup:
- The script updates the campaign status based on the evaluations, marking campaigns to pause or resume.
- It cleans up any orphaned pause dates to avoid confusion if a campaign is not paused.
Output Preparation:
- The script prepares the output by selecting changed rows and renaming columns for bulk processing.
- If running locally, it writes the output and debug information to CSV files for further analysis.

Vitals

Script ID : 459
Client ID / Customer ID: 1306917127 / 60268084
Action Type: Bulk Upload
Item Changed: Campaign
Output Columns: Account, Campaign, Status, SBA Recommended Status, SBA Pause Date
Linked Datasource: M1 Report
Reference Datasource: None
Owner: Michael Huang (mhuang@marinsoftware.com)
Created by Michael Huang on 2023-10-31 13:27
Last Updated by Michael Huang on 2024-03-08 22:12

> See it in Action

Python Code

##
## name: Epicor Budget Cap
## description:
##  Pause campaigns when MTD spend reaches Epicor Monthly Budget
## 
## author: Michael S. Huang
## created: 2023-10-31
## 

##### Configurable Param #####
# Define how close MTD spend can get to Monthly Budget before being Paused
#  - compensates for lag in system
#  - compendates for non-linearity in intraday spend
BUDGET_CAP_SAFETY_MARGIN = 0.02 # set to 2%
##############################


########### START - Local Mode Config ###########
# Step 1: Uncomment download_preview_input flag and run Preview successfully with the Datasources you want
download_preview_input=False
# Step 2: In MarinOne, go to Scripts -> Preview -> Logs, download 'dataSourceDict' pickle file, and update pickle_path below
pickle_path = '/Users/mhuang/Downloads/pickle/heartland_dental_cap_20240308.pkl'
# Step 3: Copy this script into local IDE with Python virtual env loaded with pandas and numpy.
# Step 4: Run locally with below code to init dataSourceDict

# determine if code is running on server or locally
def is_executing_on_server():
    try:
        # Attempt to access a known restricted builtin
        dict_items = dataSourceDict.items()
        return True
    except NameError:
        # NameError: dataSourceDict object is missing (indicating not on server)
        return False

local_dev = False

if is_executing_on_server():
    print("Code is executing on server. Skip init.")
elif len(pickle_path) > 3:
    print("Code is NOT executing on server. Doing init.")
    local_dev = True
    # load dataSourceDict via pickled file
    import pickle
    dataSourceDict = pickle.load(open(pickle_path, 'rb'))

    # print shape and first 5 rows for each entry in dataSourceDict
    for key, value in dataSourceDict.items():
        print(f"Shape of dataSourceDict[{key}]: {value.shape}")
        # print(f"First 5 rows of dataSourceDict[{key}]:\n{value.head(5)}")

    # set outputDf same as inputDf
    inputDf = dataSourceDict["1"]
    outputDf = inputDf.copy()

    # setup timezone
    import datetime
    # Chicago Timezone is GMT-5. Adjust as needed.
    CLIENT_TIMEZONE = datetime.timezone(datetime.timedelta(hours=-5))

    # import pandas
    import pandas as pd
    import numpy as np

    # Printing out the version of Python, Pandas and Numpy
    # import sys
    # python_version = sys.version
    # pandas_version = pd.__version__
    # numpy_version = np.__version__

    # print(f"python version: {python_version}")
    # print(f"pandas version: {pandas_version}")
    # print(f"numpy version: {numpy_version}")

    # other imports
    import re
    import urllib

    # import Marin util functions
    from marin_scripts_utils import tableize, select_changed
else:
    print("Running locally but no pickle path defined. dataSourceDict not loaded.")
    exit(1)
########### END - Local Mode Setup ###########

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_PUBLISHER_NAME = 'Publisher Name'
RPT_COL_STRATEGY = 'Strategy'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_SBA_TRAFFIC = 'SBA Traffic'
RPT_COL_SBA_RECOMMENDED_STATUS = 'SBA Recommended Status'
RPT_COL_SBA_PAUSE_DATE = 'SBA Pause Date'
RPT_COL_DAILY_BUDGET = 'Daily Budget'
RPT_COL_SBA_CALCULATED_BUDGET_DAILY = 'SBA Calculated Budget Daily'
RPT_COL_SBA_BUDGET_PACING = 'SBA Budget Pacing'
RPT_COL_SBA_ALLOCATION = 'SBA Allocation'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_EPICORID = 'epicorID'
RPT_COL_EPICORID_MONTHLY_BUDGET = 'epicorID - Monthly Budget'

# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_STATUS = 'Status'
BULK_COL_SBA_PAUSE_DATE = 'SBA Pause Date'
BULK_COL_SBA_RECOMMENDED_STATUS = 'SBA Recommended Status'

# Make inputDf a copy of original for comparison via select_change() later
originalDf = dataSourceDict["1"]
inputDf = originalDf.copy()

# define some intermediate columns
COL_MTD_EPICOR_SPEND = 'mtd_epicor_spend'

# define Status values
VAL_STATUS_ACTIVE = 'Active'
VAL_STATUS_PAUSED = 'Paused'
VAL_BLANK = ''

## Fix types and values

# Replace nan with empty strings to declutter output
inputDf.fillna(VAL_BLANK, inplace=True)
# Convert RPT_COL_EPICORID_MONTHLY_BUDGET to numeric, coercing errors to NaN
inputDf[RPT_COL_EPICORID_MONTHLY_BUDGET] = pd.to_numeric(inputDf[RPT_COL_EPICORID_MONTHLY_BUDGET], errors='coerce')
# Replace NaN values with 0.0 if that's the desired behavior
inputDf[RPT_COL_EPICORID_MONTHLY_BUDGET].fillna(0.0, inplace=True)
# Force RPT_COL_SBA_PAUSE_DATE to be String type
inputDf[RPT_COL_SBA_PAUSE_DATE] = inputDf[RPT_COL_SBA_PAUSE_DATE].astype(str)

print("inputDf shape", inputDf.shape)
print("inputDf info", inputDf.info())


# Calculate MTD Epicor Spend
inputDf[COL_MTD_EPICOR_SPEND] = inputDf.groupby(RPT_COL_EPICORID)[RPT_COL_PUB_COST].transform('sum')

# Recommend to Pause camapigns with MTD Epicor Spend over Monthly Epicor Budget (by a margin)
has_epicor_budget = inputDf[RPT_COL_EPICORID_MONTHLY_BUDGET] > 0.0
over_spent_campaigns = inputDf[COL_MTD_EPICOR_SPEND] >= inputDf[RPT_COL_EPICORID_MONTHLY_BUDGET] * (1 - BUDGET_CAP_SAFETY_MARGIN)
active_campaigns = inputDf[RPT_COL_CAMPAIGN_STATUS] == VAL_STATUS_ACTIVE
campaigns_to_pause = active_campaigns & has_epicor_budget & over_spent_campaigns

inputDf.loc[campaigns_to_pause, 'pause'] = 1
print(f"campaigns_to_pause count: {sum(campaigns_to_pause)}")

inputDf.loc[ campaigns_to_pause, \
             RPT_COL_SBA_RECOMMENDED_STATUS \
           ] = VAL_STATUS_PAUSED
 
# Recommend to reactivate campaigns with MTD Epicor Spend under Monthly Epicor Budget (by a margin)
# but limited to campaigns with SBA Pause Date populated 10 digit date
under_spent_campaigns = inputDf[COL_MTD_EPICOR_SPEND] < inputDf[RPT_COL_EPICORID_MONTHLY_BUDGET] * (1 - BUDGET_CAP_SAFETY_MARGIN)
sba_paused_campaigns = inputDf[RPT_COL_SBA_PAUSE_DATE].astype('str').str.len() >= 10
paused_campaigns = inputDf[RPT_COL_CAMPAIGN_STATUS] == VAL_STATUS_PAUSED
campaigns_to_resume = paused_campaigns & sba_paused_campaigns & under_spent_campaigns 

inputDf.loc[campaigns_to_resume, 'resume'] = 1
print(f"campaigns_to_resume count: {sum(campaigns_to_resume)}")

inputDf.loc[ campaigns_to_resume, \
             RPT_COL_SBA_RECOMMENDED_STATUS  \
           ] = VAL_STATUS_ACTIVE

# Clear Rec Status for everyone else
campaigns_to_clear = ~(campaigns_to_pause | campaigns_to_resume)

inputDf.loc[campaigns_to_clear, 'clear'] = 1
print(f"campaigns_to_clear count: {sum(campaigns_to_clear)}")

inputDf.loc[ campaigns_to_clear, \
             RPT_COL_SBA_RECOMMENDED_STATUS  \
           ] = VAL_BLANK

## Actually taffic PAUSE

should_traffic = inputDf[RPT_COL_SBA_TRAFFIC].astype(str).str.lower() == 'traffic'

should_traffic_pause = should_traffic & \
                       (inputDf[RPT_COL_SBA_RECOMMENDED_STATUS] == VAL_STATUS_PAUSED) & \
                       (inputDf[RPT_COL_SBA_RECOMMENDED_STATUS] != inputDf[RPT_COL_CAMPAIGN_STATUS])


inputDf.loc[should_traffic_pause, 'traffic_pause'] = 1
print(f"should_traffic_pause count: {sum(should_traffic_pause)}")


inputDf.loc[should_traffic_pause, RPT_COL_CAMPAIGN_STATUS] = inputDf.loc[should_traffic_pause, RPT_COL_SBA_RECOMMENDED_STATUS]
inputDf.loc[should_traffic_pause, RPT_COL_SBA_PAUSE_DATE] = today.strftime('%Y-%m-%d')

## Actually taffic RESUME

should_traffic_resume = should_traffic & \
                       (inputDf[RPT_COL_SBA_RECOMMENDED_STATUS] == VAL_STATUS_ACTIVE) & \
                       (inputDf[RPT_COL_SBA_RECOMMENDED_STATUS] != inputDf[RPT_COL_CAMPAIGN_STATUS]) & \
                       sba_paused_campaigns

inputDf.loc[should_traffic_resume, 'traffic_resume'] = 1
print(f"should_traffic_resume count: {sum(should_traffic_resume)}")

inputDf.loc[should_traffic_resume, RPT_COL_CAMPAIGN_STATUS] = inputDf.loc[should_traffic_resume, RPT_COL_SBA_RECOMMENDED_STATUS]
inputDf.loc[should_traffic_resume, RPT_COL_SBA_PAUSE_DATE] = VAL_BLANK

## Cleanup. "Pause Date" is a marker to indicate this Script actioned the Pause. If not Paused, for whatever reason, then a non-blank "Pause Date" causes confusion. 

orphan_pause_date = sba_paused_campaigns & (inputDf[RPT_COL_CAMPAIGN_STATUS] == VAL_STATUS_ACTIVE)
inputDf.loc[orphan_pause_date, RPT_COL_SBA_PAUSE_DATE] = VAL_BLANK
print(f"Cleaned up {orphan_pause_date.sum()} orphaned Pause Date")


## Prepare output

print(f"select_changed with inputDf shape {inputDf.shape} and originalDf shape {originalDf.shape}")

# only include changed rows in bulk file
(outputDf, debugDf) = select_changed(inputDf, \
                                    originalDf, \
                                    diff_cols = [ \
                                        RPT_COL_CAMPAIGN_STATUS, \
                                        RPT_COL_SBA_RECOMMENDED_STATUS, \
                                        RPT_COL_SBA_PAUSE_DATE, \
                                    ], \
                                    select_cols = [ \
                                        RPT_COL_ACCOUNT, \
                                        RPT_COL_CAMPAIGN, \
                                        RPT_COL_CAMPAIGN_STATUS, \
                                        RPT_COL_SBA_RECOMMENDED_STATUS, \
                                        RPT_COL_SBA_PAUSE_DATE, \
                                    ], \
                                    merged_cols=[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN] \
                                    )


# mark changed rows in debugDf
outputDf = outputDf.reset_index(drop=True) \
                   .set_index([RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN])
debugDf = debugDf.reset_index(drop=True) \
                 .set_index([RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN])
debugDf.loc[outputDf.index, 'changed'] = 1
outputDf = outputDf.reset_index()
debugDf = debugDf.reset_index()

# remember to use Bulk column header for Status
outputDf = outputDf.rename(columns = { \
                RPT_COL_CAMPAIGN_STATUS: BULK_COL_STATUS \
                })

print("outputDf shape", outputDf.shape)
print("sample outputDf", tableize(outputDf.tail(5)))

## local debug
if local_dev:
    output_filename = 'outputDf.csv'
    outputDf.to_csv(output_filename, index=False)
    print(f"Local Dev: Output written to: {output_filename}")

    debug_filename = 'debugDf.csv'
    debugDf.to_csv(debug_filename, index=False)
    print(f"Local Dev: Debug written to: {debug_filename}")

Post generated on 2025-03-11 01:25:51 GMT

31 Oct 2023

« Script 453: Automate ProductGroup SetBidOverride Script 469: Automate Campaigns CampaignCategoryDim »

MarinOne Scripts Creator's Corner