Script 557: Dimension Tags from Campaign Name

Purpose

The script extracts and updates specific campaign details from campaign names in a dataset.

To Elaborate

The Python script is designed to parse and extract specific details from campaign names within a dataset. It focuses on identifying and extracting start and end dates, goals, target impressions or spend/views, and CPM impression targets embedded within the campaign names. The script then compares these extracted details with existing data to identify any changes. If discrepancies are found, it updates the dataset with the new information. This process is crucial for maintaining accurate and up-to-date campaign records, ensuring that any changes in campaign parameters are reflected in the dataset. The script is particularly useful for marketing and advertising teams who need to manage and track campaign performance and allocations efficiently.

Walking Through the Code

  1. Data Preparation
    • A sample DataFrame is created to simulate the data loading process. This DataFrame includes columns for client, campaign, pacing dates, account, goal, and targets, with some columns initialized with NaN values.
  2. Temporary Columns Initialization
    • Temporary columns are added to the DataFrame to store the parsed campaign details. These columns are initialized with NaN values and will later hold the extracted start and end dates, goals, and targets.
  3. Function Definition
    • The function get_dates_and_tags_from_campaign_name is defined to parse the campaign name using a regular expression pattern. It extracts the start and end dates, goal, target impressions/spend/views, and CPM impression target from the campaign name.
  4. Data Parsing
    • The script iterates over each row in the DataFrame, applying the parsing function to extract the relevant details from the campaign name. The extracted values are stored in the corresponding temporary columns.
  5. Change Detection
    • The script identifies rows where the parsed details differ from the existing data. It checks for changes in start and end dates, goals, and targets by comparing the temporary columns with the original columns.
  6. Output Preparation
    • A new DataFrame, outputDf, is created to store only the rows with changed campaign details. The temporary columns are renamed to match the original column names for consistency.
  7. Result Display
    • The script prints the original DataFrame with parsed tags and the DataFrame containing campaigns with changed tags, providing a clear view of the updates made.

Vitals

  • Script ID : 557
  • Client ID / Customer ID: 1306926843 / 60270139
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Goal, Pacing - End Date, Pacing - Start Date, CPM Imp Target, Target (Impr/Spend/Views)
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Jesus Garza (jgarza@marinsoftware.com)
  • Created by Jesus Garza on 2023-12-04 16:09
  • Last Updated by Jesus Garza on 2023-12-14 20:28
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
##
## name: Dimension Tags from Campaign Name
## description:
##  
## 
## author: 
## created: 2023-12-04
## 

today = datetime.datetime.now().date()

# Sample DataFrame creation (replace this with your data loading logic)
data = {
    'Client': [np.nan],
    'Campaign': ['10/2/2023-12/31/2023_CPM_1300_Target1_Budget10000_Impression50000'],
    'Pacing - Start Date': [np.nan],  # Add this line
    'Pacing - End Date': [np.nan],  # Add this line
    'Account': [np.nan],  # Add this line
    'Goal': [np.nan],  # Add this line
    'Target (Impr/Spend/Views)': [np.nan],  # Add this line
    'CPM Imp Target': [np.nan],  # Add this line
}
df = pd.DataFrame(data)

RPT_COL_CLIENT = 'Client'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_PACING_START_DATE = 'Pacing - Start Date'
RPT_COL_PACING_END_DATE = 'Pacing - End Date'
RPT_COL_GOAL = 'Goal'
RPT_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'
RPT_COL_CPM_IMP_TARGET = 'CPM Imp Target'
BULK_COL_CLIENT = 'Client'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_PACING_START_DATE = 'Pacing - Start Date'
BULK_COL_PACING_END_DATE = 'Pacing - End Date'
BULK_COL_GOAL = 'Goal'
BULK_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'
BULK_COL_CPM_IMP_TARGET = 'CPM Imp Target'

outputDf = pd.DataFrame(columns=[
    BULK_COL_PACING_START_DATE,
    BULK_COL_PACING_END_DATE,
    BULK_COL_GOAL,
    BULK_COL_TARGET_IMPR_PER_SPENDVIEWS,
    BULK_COL_CPM_IMP_TARGET
])

# create temp columns to store new dim tags and default to empty
TMP_START_DATE = RPT_COL_PACING_START_DATE + '_'
TMP_END_DATE = RPT_COL_PACING_END_DATE + '_'
TMP_GOAL = RPT_COL_GOAL + '_'
TMP_TARGET_IMPR_PER_SPENDVIEWS = RPT_COL_TARGET_IMPR_PER_SPENDVIEWS + '_'
TMP_CPM_IMP_TARGET = RPT_COL_CPM_IMP_TARGET + '_'

df[TMP_START_DATE] = np.nan
df[TMP_END_DATE] = np.nan
df[TMP_GOAL] = np.nan
df[TMP_TARGET_IMPR_PER_SPENDVIEWS] = np.nan
df[TMP_CPM_IMP_TARGET] = np.nan

def get_dates_and_tags_from_campaign_name(row):
    campaign_name = row[RPT_COL_CAMPAIGN]
    # Adjusted pattern to match the campaign name format
    pattern = r'(\d{1,2}/\d{1,2}/\d{4})-(\d{1,2}/\d{1,2}/\d{4})_([^_]+)_([^_]+)_Budget(\d+)_Impression(\d+)'
    match = re.search(pattern, campaign_name)
    if match:
        start_date, end_date, goal, target_impr_spend_views, cpm_imp_target = match.groups()
        return start_date, end_date, goal, target_impr_spend_views, cpm_imp_target
    else:
        return np.nan, np.nan, np.nan, np.nan, np.nan

# parse out Start/End Date, Goal, Target (Impr/Spend/Views), and CPM Imp Target from Campaign name
for index, row in df.iterrows():
    start_date, end_date, goal, target_impr_spend_views, cpm_imp_target = get_dates_and_tags_from_campaign_name(row)
    df.at[index, TMP_START_DATE] = start_date
    df.at[index, TMP_END_DATE] = end_date
    df.at[index, TMP_GOAL] = goal
    df.at[index, TMP_TARGET_IMPR_PER_SPENDVIEWS] = target_impr_spend_views
    df.at[index, TMP_CPM_IMP_TARGET] = cpm_imp_target

# find changed campaigns
changed = (
    (df[TMP_START_DATE].notnull() & (df[RPT_COL_PACING_START_DATE] != df[TMP_START_DATE])) |
    (df[TMP_END_DATE].notnull() & (df[RPT_COL_PACING_END_DATE] != df[TMP_END_DATE])) |
    (df[TMP_GOAL].notnull() & (df[BULK_COL_GOAL] != df[TMP_GOAL])) |
    (df[TMP_TARGET_IMPR_PER_SPENDVIEWS].notnull() & (df[BULK_COL_TARGET_IMPR_PER_SPENDVIEWS] != df[TMP_TARGET_IMPR_PER_SPENDVIEWS])) |
    (df[TMP_CPM_IMP_TARGET].notnull() & (df[BULK_COL_CPM_IMP_TARGET] != df[TMP_CPM_IMP_TARGET]))
)

# only select changed rows
cols = [
    RPT_COL_CLIENT, RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN,
    TMP_START_DATE, TMP_END_DATE,
    TMP_GOAL, TMP_TARGET_IMPR_PER_SPENDVIEWS, TMP_CPM_IMP_TARGET
]
outputDf = df.loc[changed, cols].copy() \
    .rename(columns={
        TMP_START_DATE: BULK_COL_PACING_START_DATE,
        TMP_END_DATE: BULK_COL_PACING_END_DATE,
        TMP_GOAL: BULK_COL_GOAL,
        TMP_TARGET_IMPR_PER_SPENDVIEWS: BULK_COL_TARGET_IMPR_PER_SPENDVIEWS,
        TMP_CPM_IMP_TARGET: BULK_COL_CPM_IMP_TARGET
    })

print("inputDf with parsed tags", df)
print("== Campaigns with Changed Tag ==", outputDf)

Post generated on 2024-11-27 06:58:46 GMT

comments powered by Disqus