Script 813: SBA Campaign Pacing with Rec. Budget and Current Budget

Purpose:

The Python script optimizes campaign budget allocation to minimize lost impression share due to budget constraints by calculating recommended daily budgets based on historical spend, remaining budget, and pacing compliance.

To Elaborate

The script splits each budget group's remaining monthly budget across its campaigns so that less impression share is lost to budget limits. Campaigns are grouped by Salesforce Item ID (the SBA Strategy column). For each campaign, historical spend is scaled up by its lost impression share (budget) to estimate full potential spend, which is capped at twice the historical spend. Each campaign's share of its group's capped potential spend determines how much of the group's remaining budget it receives, and that amount is divided by the remaining weekdays in the month to give a recommended daily budget, raised to the minimum daily budget where necessary. Campaigns that are inactive or past their Program End Date are excluded, and a pacing percentage is reported for each budget group.
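The pacing percentage mentioned above compares month-to-date spend with the share of the monthly budget that should have been spent by now. The following is a minimal sketch of that idea, not the script's exact code: the function name `pacing_percent` and the guard that returns `None` when no days have elapsed are assumptions added for illustration.

```python
def pacing_percent(spend_mtd, monthly_budget, days_total, days_left):
    """Return MTD spend as a percentage of the prorated monthly budget.

    Hypothetical helper: returns None when the budget is not positive or
    when no days have elapsed yet (to avoid dividing by zero).
    """
    elapsed_ratio = (days_total - days_left) / days_total
    if monthly_budget <= 0 or elapsed_ratio <= 0:
        return None
    return round(100.0 * spend_mtd / (elapsed_ratio * monthly_budget))


# 10 of 30 days elapsed, so about 1000 of a 3000 budget should be spent by now
print(pacing_percent(spend_mtd=900, monthly_budget=3000, days_total=30, days_left=20))
```

A value of 100 would mean the group is spending exactly on pace; values below 100 indicate underspending.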

Walking Through the Code

Initialization and Setup:
- The script begins by checking if it is running on a server or locally, loading necessary data from a pickle file if running locally.
- Configurable parameters, such as MINIMUM_DAILY_BUDGET, are defined, allowing users to adjust the minimum daily budget threshold.
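The server-versus-local check can be sketched as follows. This is a simplified variant, assuming the platform injects a `dataSourceDict` variable on the server; instead of catching `NameError` as the script does, it looks the name up in a namespace dictionary. The `namespace` parameter is hypothetical and exists only to make the example testable.

```python
def is_executing_on_server(namespace):
    """Return True when dataSourceDict is already defined in the namespace."""
    return "dataSourceDict" in namespace


# Locally, nothing has been injected yet
local_ns = {}
# On the server, the platform is assumed to predefine dataSourceDict
server_ns = {"dataSourceDict": {"1": "placeholder for a DataFrame"}}

print(is_executing_on_server(local_ns))
print(is_executing_on_server(server_ns))
```

In a real script the same check could be written as `is_executing_on_server(globals())`.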
Data Preparation:
- The script processes input data, converting necessary columns to appropriate data types and setting the index for grouping.
- It calculates the full potential spend by adjusting historical spend based on lost impression share due to budget constraints.
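The full potential spend adjustment can be illustrated with a small example. It assumes lost impression share is stored as a fraction between 0 and 1 (for example 0.20 for 20%); the sample figures and the output column name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows: historical spend and lost impression share (budget)
df = pd.DataFrame({
    "Pub. Cost $": [80.0, 50.0, 30.0],
    "Lost Impr. Share (Budget) %": [0.20, 0.0, 0.50],
})

lost = df["Lost Impr. Share (Budget) %"]
# 1 + L / (1 - L) equals 1 / (1 - L): spend had no impressions been lost to budget
adj_ratio = 1 + lost / (1 - lost)
df["spend_full_potential"] = (df["Pub. Cost $"] * adj_ratio).round(2)

print(df)
```

A campaign that spent 80 while losing 20% of impressions to budget is estimated at a full potential of 100. A lost share of exactly 1.0 would divide by zero, so real data may need a guard.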
Aggregation and Filtering:
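- Daily rows are aggregated per campaign (by SBA Strategy, Strategy, Publisher Name, Account, and Campaign): clicks, conversions, and spend are summed, while status, budget, and SBA fields take their last value.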
- Inactive or expired campaigns are filtered out to focus on active campaigns with recent spend.
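A minimal sketch of the aggregation and filtering step, using made-up data and a fixed `today` so the result is reproducible. The filter conditions mirror the ones in the script: a campaign is dropped when it is not Active and has no spend, or when its Program End Date is in the past.

```python
import pandas as pd

today = pd.Timestamp("2024-03-14")  # fixed date, an assumption for the example

# Hypothetical daily rows for three campaigns
rows = pd.DataFrame({
    "Campaign": ["A", "A", "B", "B", "C", "C"],
    "Date": pd.to_datetime(["2024-03-12", "2024-03-13"] * 3),
    "Campaign Status": ["Active", "Active", "Paused", "Paused", "Active", "Active"],
    "Pub. Cost $": [40.0, 60.0, 0.0, 0.0, 10.0, 15.0],
    "Program End Date": pd.to_datetime(
        [None, None, None, None, "2024-03-01", "2024-03-01"]
    ),
})

# Collapse the date dimension: sum spend, keep the latest status and end date
agg = rows.groupby("Campaign").agg({
    "Campaign Status": "last",
    "Pub. Cost $": "sum",
    "Program End Date": "last",
})

inactive = (agg["Campaign Status"] != "Active") & (agg["Pub. Cost $"] == 0)
expired = agg["Program End Date"].notnull() & (agg["Program End Date"] < today)
kept = agg.loc[~(inactive | expired)]

print(kept)
```

Campaign B is dropped as inactive without spend and campaign C as expired, leaving only campaign A.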
Budget Allocation:
- The script calculates budget allocation ratios, capping full potential spend at twice the historical spend.
- Remaining budget is allocated to campaigns based on calculated ratios.
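The allocation step can be sketched as below, with invented numbers. The column names `spend`, `capped`, `ratio`, `allocated`, and `budget_remaining_group` are placeholders rather than the script's own, and `np.minimum` stands in for the row-wise `apply` the script uses.

```python
import numpy as np
import pandas as pd

# Hypothetical campaigns in two budget groups (X and Y)
df = pd.DataFrame({
    "SBA Strategy": ["X", "X", "Y"],
    "Campaign": ["A", "B", "C"],
    "spend": [100.0, 50.0, 40.0],
    "spend_full_potential": [150.0, 200.0, 40.0],
    "budget_remaining_group": [1000.0, 1000.0, 300.0],
})

# Cap full potential spend at twice the historical spend
df["capped"] = np.minimum(df["spend_full_potential"], 2 * df["spend"])

# transform('sum') puts each group's total on every row, so no join is needed
group_total = df.groupby("SBA Strategy")["capped"].transform("sum")
df["ratio"] = df["capped"] / group_total

# Split each group's remaining budget according to the ratio
df["allocated"] = (df["budget_remaining_group"] * df["ratio"]).round(1)

print(df[["Campaign", "capped", "ratio", "allocated"]])
```

Campaign B's potential of 200 is capped at 100, so group X splits its remaining 1000 as 600 and 400.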
Daily Budget Calculation:
- Recommended daily budgets are calculated by dividing allocated budgets by remaining weekdays in the month.
- Budgets below the minimum threshold are adjusted to meet the minimum daily budget requirement.
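A short sketch of the daily budget calculation with the minimum applied. The `days_left` value and the allocated amounts are made up, and `Series.clip` is used here as a compact alternative to the boolean mask in the script.

```python
import pandas as pd

MINIMUM_DAILY_BUDGET = 10
days_left = 12  # hypothetical number of days remaining in the month

# Hypothetical allocated budgets per campaign
allocated = pd.Series([600.0, 400.0, 60.0], index=["A", "B", "C"])

# max(1, ...) guards against dividing by zero on the last day of the month
rec_daily = (allocated / max(1, days_left)).round(0)

# Raise anything below the minimum up to the minimum
rec_daily = rec_daily.clip(lower=MINIMUM_DAILY_BUDGET)

print(rec_daily)
```

Campaign C would receive 5 per day from its allocation, so it is raised to the minimum of 10.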
Output Generation:
- The script identifies changes in recommended budgets and generates an output dataframe with updated budget allocations for campaigns with detected changes.
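Change detection can be sketched as a comparison between the existing and newly calculated columns, shown here for a single column with invented values. Note that in pandas a missing value compares as not equal to anything, so a row with no previous recommendation is flagged as changed.

```python
import numpy as np
import pandas as pd

# Hypothetical campaigns with the old and newly calculated recommendation
df = pd.DataFrame({
    "Campaign": ["A", "B", "C"],
    "Rec. Daily Budget": [50.0, 33.0, np.nan],
    "Rec. Daily Budget_new": [50.0, 35.0, 10.0],
})

# A row counts as changed when a new value exists and differs from the old one
changed = df["Rec. Daily Budget_new"].notnull() & (
    df["Rec. Daily Budget"] != df["Rec. Daily Budget_new"]
)

# Keep only changed rows and rename the new column to its upload name
output = (
    df.loc[changed, ["Campaign", "Rec. Daily Budget_new"]]
      .rename(columns={"Rec. Daily Budget_new": "Rec. Daily Budget"})
)

print(output)
```

Campaign A is unchanged and is left out of the output; B and C are emitted with their new values.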

Vitals

Script ID : 813
Client ID / Customer ID: 1306926629 / 60270083
Action Type: Bulk Upload
Item Changed: Campaign
Output Columns: Account, Campaign, Daily Budget, Rec. Daily Budget, SBA Allocation, SBA Budget Pacing, Current Daily Budget, Budget Difference
Linked Datasource: M1 Report
Reference Datasource: None
Owner: dwaidhas@marinsoftware.com
Created by dwaidhas@marinsoftware.com on 2024-03-14 15:20
Last Updated by dwaidhas@marinsoftware.com on 2024-04-02 09:21


Python Code

#
# SBA Campaign Budget Pacing - Minimize Lost IS (Budget)
#
# Allocates according to:
# * Remaining budget for each Salesforce Item ID Budget Group
# * Remaining weekdays in month
# * Historical spend and spend potential
# * Campaigns with spend in lookback period
# * Minimum daily budget
#
# Author: Dana Waidhas
#
# Created: 2024-02-26
#

##### Configurable Param #####

MINIMUM_DAILY_BUDGET = 10

##############################

########### START - Local Mode Config ###########
# Step 1: Uncomment download_preview_input flag and run Preview successfully with the Datasources you want
download_preview_input=False
# Step 2: In MarinOne, go to Scripts -> Preview -> Logs, download 'dataSourceDict' pickle file, and update pickle_path below
# pickle_path = ''
pickle_path = '/Users/mhuang/Downloads/pickle/avb_marketing_datasource_dict_1702622906522.pkl'
# Step 3: Copy this script into local IDE with Python virtual env loaded with pandas and numpy.
# Step 4: Run locally with below code to init dataSourceDict

# determine if code is running on server or locally
def is_executing_on_server():
    try:
        # Probe for dataSourceDict, which only exists when running on the server
        dict_items = dataSourceDict.items()
        return True
    except NameError:
        # NameError: dataSourceDict object is missing (indicating not on server)
        return False

if is_executing_on_server():
    print("Code is executing on server. Skip init.")
elif len(pickle_path) > 3:
    print("Code is NOT executing on server. Doing init.")
    # load dataSourceDict via pickled file
    import pickle
    with open(pickle_path, 'rb') as pickle_file:
        dataSourceDict = pickle.load(pickle_file)

    # print shape and first 5 rows for each entry in dataSourceDict
    for key, value in dataSourceDict.items():
        print(f"Shape of dataSourceDict[{key}]: {value.shape}")
        # print(f"First 5 rows of dataSourceDict[{key}]:\n{value.head(5)}")

    # set outputDf same as inputDf
    inputDf = dataSourceDict["1"]
    outputDf = inputDf.copy()

    # setup timezone
    import datetime
    # Fixed UTC-5 offset (Chicago during daylight saving time; standard time is UTC-6). Adjust as needed.
    CLIENT_TIMEZONE = datetime.timezone(datetime.timedelta(hours=-5))

    # import pandas
    import pandas as pd
    import numpy as np

    # other imports
    import re
    import urllib

    # import Marin util functions
    from marin_scripts_utils import tableize, select_changed
else:
    print("Running locally but no pickle path defined. dataSourceDict not loaded.")
    exit(1)
########### END - Local Mode Config ###########



RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_DATE = 'Date'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_PUBLISHER_NAME = 'Publisher Name'
RPT_COL_STRATEGY = 'Strategy'
RPT_COL_DAILY_BUDGET = 'Daily Budget'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_CLICKS = 'Clicks'
RPT_COL_CONV = 'Conv.'
RPT_COL_IMPR_SHARE = 'Impr. share %'
RPT_COL_LOST_IMPR_SHARE_BUDGET = 'Lost Impr. Share (Budget) %'
RPT_COL_LOST_IMPR_SHARE_RANK = 'Lost Impr. Share (Rank) %'
RPT_COL_SBA_STRATEGY = 'SBA Strategy'
RPT_COL_SBA_CAMPAIGN_BUDGET = 'SBA Campaign Budget'
RPT_COL_SBA_ALLOCATION = 'SBA Allocation'
RPT_COL_REC_DAILY_BUDGET = 'Rec. Daily Budget'
RPT_COL_SBA_BUDGET_PACING = 'SBA Budget Pacing'
RPT_COL_SBA_TRAFFIC = 'SBA Traffic'
RPT_COL_PROGRAM_END_Date = 'Program End Date'
RPT_COL_CURRENT_DAILY_BUDGET = 'Current Daily Budget'
RPT_COL_BUDGET_DIFFERENCE = 'Budget Difference'

BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_DAILY_BUDGET = 'Daily Budget'
BULK_COL_SBA_ALLOCATION = 'SBA Allocation'
BULK_COL_SBA_BUDGET_PACING = 'SBA Budget Pacing'
BULK_COL_REC_DAILY_BUDGET = 'Rec. Daily Budget'
BULK_COL_CURRENT_DAILY_BUDGET = 'Current Daily Budget'
BULK_COL_BUDGET_DIFFERENCE = 'Budget Difference'


COL_SPEND_FULL_POTENTIAL = 'spend_lookback_full_potential'
COL_SPEND_FULL_POTENTIAL_CAPPED = 'spend_lookback_full_potential_capped'
COL_SPEND_MTD = 'spend_mtd'
COL_SBA_ALLOCATION_NEW_FLOAT = RPT_COL_SBA_ALLOCATION + '_new_float'
COL_SBA_ALLOCATION_NEW = RPT_COL_SBA_ALLOCATION + '_new'
COL_SBA_STRATEGY_BUDGET_REMAINING = 'SBA_Campaign_budget_remaining'
COL_SBA_BUDGET_PACING_NEW = RPT_COL_SBA_BUDGET_PACING + '_new'
COL_BUDGET_REMAINING = 'budget_remaining'
COL_DAILY_BUDGET_NEW = RPT_COL_DAILY_BUDGET + '_new'
COL_REC_DAILY_BUDGET_NEW = RPT_COL_REC_DAILY_BUDGET + '_new'
COL_CURRENT_DAILY_BUDGET_NEW = RPT_COL_CURRENT_DAILY_BUDGET + '_new'
COL_BUDGET_DIFFERENCE_NEW = RPT_COL_BUDGET_DIFFERENCE + '_new'

COL_DAYS_REMAINING= 'weekdays_remaining'
COL_DAYS_TOTAL= 'weekdays_total'
COL_PACING_CALC = 'pacing_calc'


outputDf[BULK_COL_DAILY_BUDGET] = "<<YOUR VALUE>>"

today = datetime.datetime.now(CLIENT_TIMEZONE).date()
print("inputDf shape", inputDf.shape)
print("inputDf dtypes", inputDf.dtypes)

# change back to percent string
if inputDf[RPT_COL_SBA_ALLOCATION].dtype == "float":
    inputDf[RPT_COL_SBA_ALLOCATION] = round(inputDf[RPT_COL_SBA_ALLOCATION] * 100.0, 0).astype(str) + '%'
if inputDf[BULK_COL_SBA_BUDGET_PACING].dtype == "float":
    inputDf[BULK_COL_SBA_BUDGET_PACING] = round(inputDf[BULK_COL_SBA_BUDGET_PACING] * 100.0, 0).astype(str) + '%'

# coerce Program End Date into Date type
inputDf[RPT_COL_PROGRAM_END_Date] = pd.to_datetime(inputDf[RPT_COL_PROGRAM_END_Date], errors='coerce')

inputDf = inputDf.set_index([RPT_COL_SBA_STRATEGY])
group_by_salesforce_item_ID = inputDf.groupby(RPT_COL_SBA_STRATEGY)

# ## Calculate Full-Potential Spend
# * Adjust Historical Spend by _Lost Impression Share due to Budget_ (see [Formula](https://docs.google.com/document/d/1EbCQ5z9Up8TZ6GISEeCaRSB3Fc15vCPCfeIydree23M/edit#bookmark=id.5fsx7jlseze6))
# 

adj_ratio = 1 + (inputDf[RPT_COL_LOST_IMPR_SHARE_BUDGET] / (1 - inputDf[RPT_COL_LOST_IMPR_SHARE_BUDGET]))

inputDf[COL_SPEND_FULL_POTENTIAL] = round(inputDf[RPT_COL_PUB_COST] * adj_ratio, 2)

# ## Remove Date Segmentation
# * Calculate MTD Spend

# SUM Series with Date index and only includes current month
def current_month_sum(x):
    x = x.sort_index()
    mtd = x[ (x.index.month == today.month) & (x.values > 0)]
    return mtd.sum()

groupby_cols = [ \
    RPT_COL_SBA_STRATEGY, \
    RPT_COL_STRATEGY, \
    RPT_COL_PUBLISHER_NAME, \
    RPT_COL_ACCOUNT, \
    RPT_COL_CAMPAIGN, \
]


agg_spec = {
    RPT_COL_CAMPAIGN_STATUS: 'last', \
    RPT_COL_DAILY_BUDGET: 'last', \
    RPT_COL_SBA_CAMPAIGN_BUDGET: 'last', \
    RPT_COL_SBA_ALLOCATION: 'last', \
    RPT_COL_REC_DAILY_BUDGET: 'last', \
    RPT_COL_CURRENT_DAILY_BUDGET: 'last', \
    RPT_COL_BUDGET_DIFFERENCE: 'last', \
    RPT_COL_SBA_BUDGET_PACING: 'last', \
    RPT_COL_SBA_TRAFFIC: 'last', \
    RPT_COL_CLICKS: 'sum', \
    RPT_COL_CONV: 'sum', \
    RPT_COL_PUB_COST: 'sum', \
    COL_SPEND_MTD: current_month_sum, \
    COL_SPEND_FULL_POTENTIAL: 'sum', \
    RPT_COL_PROGRAM_END_Date: 'last', \
}

inputDf[COL_SPEND_MTD] = inputDf[RPT_COL_PUB_COST]

df_campaign_agg = inputDf.reset_index() \
                         .set_index(RPT_COL_DATE) \
                         .groupby(groupby_cols) \
                         .agg(agg_spec) \
                         .reset_index() \
                         .set_index(RPT_COL_SBA_STRATEGY)


# ## Only allocate budget for recently trafficking campaigns
# * Exclude Campaigns that are:
# ** not Active and without spend in the lookback period
# ** past their Program End Date

inactive_campaigns = (df_campaign_agg[RPT_COL_CAMPAIGN_STATUS] != 'Active') & (df_campaign_agg[RPT_COL_PUB_COST] == 0)
expired_campaigns = df_campaign_agg[RPT_COL_PROGRAM_END_Date].notnull() & (df_campaign_agg[RPT_COL_PROGRAM_END_Date] < pd.to_datetime(today))
df_campaign_agg = df_campaign_agg.loc[ ~(inactive_campaigns | expired_campaigns) ]


# ## Calculate Budget Allocation Ratio 
# * Cap full potential spend at 2X (don't spend twice as much as before)
# * Compare full potential spend for each campaign to total spend within same SALESFORCE_ITEM_ID budget group

df_campaign_agg[COL_SPEND_FULL_POTENTIAL_CAPPED] = df_campaign_agg \
    .apply(lambda row: min(row[COL_SPEND_FULL_POTENTIAL], 2 * row[RPT_COL_PUB_COST]), axis=1)


# use transform to calculate sum for each SALESFORCE_ITEM_ID and make it available on every row
# note: no need to build aggregate DataFrame and JOIN back to original

df_campaign_agg[COL_SBA_ALLOCATION_NEW_FLOAT] = 100.0 * \
        df_campaign_agg[COL_SPEND_FULL_POTENTIAL_CAPPED] / \
        df_campaign_agg.groupby(RPT_COL_SBA_STRATEGY)[COL_SPEND_FULL_POTENTIAL_CAPPED].transform('sum')

df_campaign_agg[COL_SBA_ALLOCATION_NEW] = round(df_campaign_agg[COL_SBA_ALLOCATION_NEW_FLOAT],0).astype(str) + '%'


# 
# ## Calculate Remaining Budget
# * For each SBA Strategy budget group, calculate how much Budget is left by subtracting MTD SALESFORCE_ITEM_ID spend from the SBA Monthly budget

df_campaign_agg[COL_SBA_STRATEGY_BUDGET_REMAINING] =  \
        df_campaign_agg[RPT_COL_SBA_CAMPAIGN_BUDGET] - \
        df_campaign_agg.groupby(by=[RPT_COL_SBA_STRATEGY])[COL_SPEND_MTD].sum()


# ## Allocate Budget
# * Allocate remaining budget to each campaign according to ratio calculated above

df_campaign_agg[COL_BUDGET_REMAINING] = round(df_campaign_agg[COL_SBA_STRATEGY_BUDGET_REMAINING] * df_campaign_agg[COL_SBA_ALLOCATION_NEW_FLOAT] / 100.0, 1)


# ## Calculate SBA Daily Budget
# 
# * Calculate next day Daily Budget by dividing allocated budget by number of Days left in the current month


today_numpy = pd.to_datetime(today).to_numpy().astype('datetime64[D]')
next_month_start = (today_numpy + pd.offsets.BMonthBegin()).to_numpy().astype('datetime64[D]')

# for months ending on weekends, use max(1,x) to avoid dividing by zero
days_left = max(1, (next_month_start - today_numpy).astype('timedelta64[D]').astype(int))

df_campaign_agg[COL_DAYS_REMAINING] = days_left
df_campaign_agg[COL_REC_DAILY_BUDGET_NEW] = round(df_campaign_agg[COL_BUDGET_REMAINING] / days_left, 0)

# ### Apply Minimum Rule
# * Bump allocated budget above minimum

allocated_below_min = (df_campaign_agg[COL_REC_DAILY_BUDGET_NEW] < MINIMUM_DAILY_BUDGET)
df_campaign_agg.loc[allocated_below_min, COL_REC_DAILY_BUDGET_NEW] = MINIMUM_DAILY_BUDGET


# ### Traffic Budget
df_campaign_agg[COL_DAILY_BUDGET_NEW] = np.nan

# campaigns to traffic
to_traffic = df_campaign_agg[RPT_COL_SBA_TRAFFIC].notnull() & \
            (df_campaign_agg[RPT_COL_SBA_TRAFFIC].astype(str).str.lower() == 'traffic')
print("Traffic count", to_traffic.sum())

# preserve current budget in Dimensions field
df_campaign_agg[COL_CURRENT_DAILY_BUDGET_NEW] = df_campaign_agg[RPT_COL_DAILY_BUDGET]
# copy rec budget to publisher daily budget for trafficking campaigns
df_campaign_agg.loc[to_traffic, COL_DAILY_BUDGET_NEW] = df_campaign_agg.loc[to_traffic, COL_REC_DAILY_BUDGET_NEW]
# Calculate 'Budget Difference' column and round to two decimal places
df_campaign_agg[COL_BUDGET_DIFFERENCE_NEW] = round(df_campaign_agg[COL_REC_DAILY_BUDGET_NEW]-df_campaign_agg[RPT_COL_CURRENT_DAILY_BUDGET], 2)


# ## Calculate Salesforce Item ID-level Pacing compliance percentage. Ideally should be 100% each day.

# number of elapsed workdays
current_month_start = pd.to_datetime(today.replace(day=1)).to_numpy().astype('datetime64[D]')
total_days_in_month = (next_month_start - current_month_start).astype('timedelta64[D]').astype(int)
df_campaign_agg[COL_DAYS_TOTAL] = total_days_in_month
prorated_ratio = (total_days_in_month - days_left) / total_days_in_month

print("today", today)
print("current_month_start", current_month_start)
print("next_month_start", next_month_start)
print("weekdays_in_month", total_days_in_month)
print("weekdays_left", days_left)
print("prorated_ratio", prorated_ratio)

# divide MTD spend by prorated total budget
mask = df_campaign_agg[RPT_COL_SBA_CAMPAIGN_BUDGET] > 0
df_campaign_agg[COL_PACING_CALC] = round(100.0 * \
                                    df_campaign_agg.groupby(by=[RPT_COL_SBA_STRATEGY])[COL_SPEND_MTD].sum() / \
                                    (prorated_ratio * df_campaign_agg[RPT_COL_SBA_CAMPAIGN_BUDGET]), \
                                    0).astype(str) + '%'
df_campaign_agg.loc[mask, COL_SBA_BUDGET_PACING_NEW] = df_campaign_agg.loc[mask, COL_PACING_CALC]



# Debug DF with full details
df_SALESFORCE_ITEM_ID_budget = group_by_salesforce_item_ID[[RPT_COL_SBA_CAMPAIGN_BUDGET]].transform('max').dropna().drop_duplicates()
print("Salesforce Item ID budgets", df_SALESFORCE_ITEM_ID_budget.head().to_string())


# ## Generate outputDf
# Check for changes
changed = df_campaign_agg[COL_REC_DAILY_BUDGET_NEW].notnull() & \
    ( \
       (df_campaign_agg[RPT_COL_REC_DAILY_BUDGET] != df_campaign_agg[COL_REC_DAILY_BUDGET_NEW]) | \
       (df_campaign_agg[RPT_COL_DAILY_BUDGET] != df_campaign_agg[COL_DAILY_BUDGET_NEW]) | \
       (df_campaign_agg[RPT_COL_SBA_ALLOCATION] != df_campaign_agg[COL_SBA_ALLOCATION_NEW]) | \
       (df_campaign_agg[RPT_COL_SBA_BUDGET_PACING] != df_campaign_agg[COL_SBA_BUDGET_PACING_NEW]) | \
       (df_campaign_agg[RPT_COL_CURRENT_DAILY_BUDGET] != df_campaign_agg[COL_CURRENT_DAILY_BUDGET_NEW]) | \
       (df_campaign_agg[RPT_COL_BUDGET_DIFFERENCE] != df_campaign_agg[COL_BUDGET_DIFFERENCE_NEW]) \
    )

print("Changed rows:", changed.sum())

# Debug
debugDf = df_campaign_agg.loc[changed] \
      .reset_index() \
      .sort_values(by=[RPT_COL_SBA_STRATEGY, COL_DAILY_BUDGET_NEW, COL_REC_DAILY_BUDGET_NEW], ascending=False) 

# print("debugDf", tableize(debugDf))

# Only emit output for changed campaigns
if changed.sum() > 0:

    # construct outputDf
    outputDf = df_campaign_agg.loc[changed, [RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN, 
                                     COL_DAILY_BUDGET_NEW, COL_REC_DAILY_BUDGET_NEW, 
                                     COL_SBA_BUDGET_PACING_NEW, COL_SBA_ALLOCATION_NEW,
                                     COL_CURRENT_DAILY_BUDGET_NEW, COL_BUDGET_DIFFERENCE_NEW]] \
                  .copy() \
                  .rename(columns={ \
                        COL_DAILY_BUDGET_NEW: BULK_COL_DAILY_BUDGET, \
                        COL_REC_DAILY_BUDGET_NEW: BULK_COL_REC_DAILY_BUDGET, \
                        COL_SBA_BUDGET_PACING_NEW: BULK_COL_SBA_BUDGET_PACING, \
                        COL_SBA_ALLOCATION_NEW: BULK_COL_SBA_ALLOCATION, \
                        COL_CURRENT_DAILY_BUDGET_NEW: BULK_COL_CURRENT_DAILY_BUDGET, \
                        COL_BUDGET_DIFFERENCE_NEW: BULK_COL_BUDGET_DIFFERENCE, \
                    }) \
                  .reset_index() \
                  .sort_values(by=[RPT_COL_SBA_STRATEGY, BULK_COL_DAILY_BUDGET, BULK_COL_REC_DAILY_BUDGET], ascending=False) \
                  .drop(RPT_COL_SBA_STRATEGY, axis=1)


    print("outputDf shape", outputDf.shape)
    print("outputDf", tableize(outputDf.head()))
else:
    print("No changes detected, returning an empty dataframe")
    outputDf = pd.DataFrame(columns=[BULK_COL_ACCOUNT, BULK_COL_CAMPAIGN, BULK_COL_DAILY_BUDGET, BULK_COL_REC_DAILY_BUDGET, BULK_COL_SBA_BUDGET_PACING, BULK_COL_SBA_ALLOCATION, BULK_COL_CURRENT_DAILY_BUDGET, BULK_COL_BUDGET_DIFFERENCE])
    

Post generated on 2025-03-11 01:25:51 GMT

14 Mar 2024

