Script 557: Dimension Tags from Campaign Name
Purpose
The script extracts and updates specific campaign details from campaign names in a dataset.
To Elaborate
The Python script is designed to parse and extract specific details from campaign names within a dataset. It focuses on identifying and extracting start and end dates, goals, target impressions or spend/views, and CPM impression targets embedded within the campaign names. The script then compares these extracted details with existing data to identify any changes. If discrepancies are found, it updates the dataset with the new information. This process is crucial for maintaining accurate and up-to-date campaign records, ensuring that any changes in campaign parameters are reflected in the dataset. The script is particularly useful for marketing and advertising teams who need to manage and track campaign performance and allocations efficiently.
Walking Through the Code
- Data Preparation
- A sample DataFrame is created to simulate the data loading process. This DataFrame includes columns for client, campaign, pacing dates, account, goal, and targets, with some columns initialized with
NaN
values.
- A sample DataFrame is created to simulate the data loading process. This DataFrame includes columns for client, campaign, pacing dates, account, goal, and targets, with some columns initialized with
- Temporary Columns Initialization
- Temporary columns are added to the DataFrame to store the parsed campaign details. These columns are initialized with
NaN
values and will later hold the extracted start and end dates, goals, and targets.
- Temporary columns are added to the DataFrame to store the parsed campaign details. These columns are initialized with
- Function Definition
- The function
get_dates_and_tags_from_campaign_name
is defined to parse the campaign name using a regular expression pattern. It extracts the start and end dates, goal, target impressions/spend/views, and CPM impression target from the campaign name.
- The function
- Data Parsing
- The script iterates over each row in the DataFrame, applying the parsing function to extract the relevant details from the campaign name. The extracted values are stored in the corresponding temporary columns.
- Change Detection
- The script identifies rows where the parsed details differ from the existing data. It checks for changes in start and end dates, goals, and targets by comparing the temporary columns with the original columns.
- Output Preparation
- A new DataFrame,
outputDf
, is created to store only the rows with changed campaign details. The temporary columns are renamed to match the original column names for consistency.
- A new DataFrame,
- Result Display
- The script prints the original DataFrame with parsed tags and the DataFrame containing campaigns with changed tags, providing a clear view of the updates made.
Vitals
- Script ID : 557
- Client ID / Customer ID: 1306926843 / 60270139
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Goal, Pacing - End Date, Pacing - Start Date, CPM Imp Target, Target (Impr/Spend/Views)
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Jesus Garza (jgarza@marinsoftware.com)
- Created by Jesus Garza on 2023-12-04 16:09
- Last Updated by Jesus Garza on 2023-12-14 20:28
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
##
## name: Dimension Tags from Campaign Name
## description:
##
##
## author:
## created: 2023-12-04
##
today = datetime.datetime.now().date()
# Sample DataFrame creation (replace this with your data loading logic)
data = {
'Client': [np.nan],
'Campaign': ['10/2/2023-12/31/2023_CPM_1300_Target1_Budget10000_Impression50000'],
'Pacing - Start Date': [np.nan], # Add this line
'Pacing - End Date': [np.nan], # Add this line
'Account': [np.nan], # Add this line
'Goal': [np.nan], # Add this line
'Target (Impr/Spend/Views)': [np.nan], # Add this line
'CPM Imp Target': [np.nan], # Add this line
}
df = pd.DataFrame(data)
RPT_COL_CLIENT = 'Client'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_PACING_START_DATE = 'Pacing - Start Date'
RPT_COL_PACING_END_DATE = 'Pacing - End Date'
RPT_COL_GOAL = 'Goal'
RPT_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'
RPT_COL_CPM_IMP_TARGET = 'CPM Imp Target'
BULK_COL_CLIENT = 'Client'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_PACING_START_DATE = 'Pacing - Start Date'
BULK_COL_PACING_END_DATE = 'Pacing - End Date'
BULK_COL_GOAL = 'Goal'
BULK_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'
BULK_COL_CPM_IMP_TARGET = 'CPM Imp Target'
outputDf = pd.DataFrame(columns=[
BULK_COL_PACING_START_DATE,
BULK_COL_PACING_END_DATE,
BULK_COL_GOAL,
BULK_COL_TARGET_IMPR_PER_SPENDVIEWS,
BULK_COL_CPM_IMP_TARGET
])
# create temp columns to store new dim tags and default to empty
TMP_START_DATE = RPT_COL_PACING_START_DATE + '_'
TMP_END_DATE = RPT_COL_PACING_END_DATE + '_'
TMP_GOAL = RPT_COL_GOAL + '_'
TMP_TARGET_IMPR_PER_SPENDVIEWS = RPT_COL_TARGET_IMPR_PER_SPENDVIEWS + '_'
TMP_CPM_IMP_TARGET = RPT_COL_CPM_IMP_TARGET + '_'
df[TMP_START_DATE] = np.nan
df[TMP_END_DATE] = np.nan
df[TMP_GOAL] = np.nan
df[TMP_TARGET_IMPR_PER_SPENDVIEWS] = np.nan
df[TMP_CPM_IMP_TARGET] = np.nan
def get_dates_and_tags_from_campaign_name(row):
campaign_name = row[RPT_COL_CAMPAIGN]
# Adjusted pattern to match the campaign name format
pattern = r'(\d{1,2}/\d{1,2}/\d{4})-(\d{1,2}/\d{1,2}/\d{4})_([^_]+)_([^_]+)_Budget(\d+)_Impression(\d+)'
match = re.search(pattern, campaign_name)
if match:
start_date, end_date, goal, target_impr_spend_views, cpm_imp_target = match.groups()
return start_date, end_date, goal, target_impr_spend_views, cpm_imp_target
else:
return np.nan, np.nan, np.nan, np.nan, np.nan
# parse out Start/End Date, Goal, Target (Impr/Spend/Views), and CPM Imp Target from Campaign name
for index, row in df.iterrows():
start_date, end_date, goal, target_impr_spend_views, cpm_imp_target = get_dates_and_tags_from_campaign_name(row)
df.at[index, TMP_START_DATE] = start_date
df.at[index, TMP_END_DATE] = end_date
df.at[index, TMP_GOAL] = goal
df.at[index, TMP_TARGET_IMPR_PER_SPENDVIEWS] = target_impr_spend_views
df.at[index, TMP_CPM_IMP_TARGET] = cpm_imp_target
# find changed campaigns
changed = (
(df[TMP_START_DATE].notnull() & (df[RPT_COL_PACING_START_DATE] != df[TMP_START_DATE])) |
(df[TMP_END_DATE].notnull() & (df[RPT_COL_PACING_END_DATE] != df[TMP_END_DATE])) |
(df[TMP_GOAL].notnull() & (df[BULK_COL_GOAL] != df[TMP_GOAL])) |
(df[TMP_TARGET_IMPR_PER_SPENDVIEWS].notnull() & (df[BULK_COL_TARGET_IMPR_PER_SPENDVIEWS] != df[TMP_TARGET_IMPR_PER_SPENDVIEWS])) |
(df[TMP_CPM_IMP_TARGET].notnull() & (df[BULK_COL_CPM_IMP_TARGET] != df[TMP_CPM_IMP_TARGET]))
)
# only select changed rows
cols = [
RPT_COL_CLIENT, RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN,
TMP_START_DATE, TMP_END_DATE,
TMP_GOAL, TMP_TARGET_IMPR_PER_SPENDVIEWS, TMP_CPM_IMP_TARGET
]
outputDf = df.loc[changed, cols].copy() \
.rename(columns={
TMP_START_DATE: BULK_COL_PACING_START_DATE,
TMP_END_DATE: BULK_COL_PACING_END_DATE,
TMP_GOAL: BULK_COL_GOAL,
TMP_TARGET_IMPR_PER_SPENDVIEWS: BULK_COL_TARGET_IMPR_PER_SPENDVIEWS,
TMP_CPM_IMP_TARGET: BULK_COL_CPM_IMP_TARGET
})
print("inputDf with parsed tags", df)
print("== Campaigns with Changed Tag ==", outputDf)
Post generated on 2024-11-27 06:58:46 GMT