Script 675: Campaign Name Dimension Auto Tagging Region
Purpose:
The script parses campaign names to automatically tag them with a region-specific dimension based on predefined rules.
To Elaborate
The Python script automates tagging campaigns with a region-specific dimension. It takes the text that follows the first comma in the campaign name as the tag. If the campaign name also contains the pattern ' - USA - [Local] - ' followed by a city, the script appends 'USA - [Local] - ' plus the city name to the tag. Any standalone '2017' is then removed from the tag. The script writes the new tag only when it is non-empty and differs from the existing Region value. This keeps campaign tagging consistent and accurate, which is crucial for structured budget allocation (SBA) and reporting purposes.
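To make the parsing rule concrete, here is a minimal sketch of the two string operations described above: splitting at the first comma, and capturing the city from the 'USA - [Local]' pattern. The campaign names are invented for illustration, and the helper names `tag_after_first_separator` and `local_city` are hypothetical, not part of the script.

```python
import re

SEP = ','

def tag_after_first_separator(campaign_name):
    """Return the text after the first separator, or '' if there is none."""
    parts = campaign_name.split(SEP, 1)
    return parts[1].strip() if len(parts) > 1 else ''

def local_city(campaign_name):
    """Return the city captured from '- USA - [Local] - <city> - ', or None."""
    match = re.search(r'- USA - \[Local\] - (.*?) - ', campaign_name)
    return match.group(1) if match else None

# Hypothetical campaign names, for illustration only
print(tag_after_first_separator('Brand Search, Southeast'))
print(tag_after_first_separator('Brand Search'))
print(local_city('Brand - USA - [Local] - Nashville - Search'))
print(local_city('Brand - USA - [Local] - Nashville'))
```

Note the last case: the pattern requires a trailing ' - ' after the city, so a name that ends right after the city does not match and the helper returns None.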
Walking Through the Code
- Configurable Parameters:
- The script begins by defining configurable parameters such as the separator (SEP) used to split the campaign name and the location of the tag (TAG_LOCATION).
- It specifies the primary data source (inputDf) and the relevant columns for processing, including campaign, account, and region.
- Tag Extraction Function:
- A function get_tag_from_campaign_name is defined to extract the tag from the campaign name.
- It splits the campaign name at the first comma and applies additional logic to extract specific patterns, such as 'USA - [Local] - Nashville'.
- The function also removes any occurrence of '2017' from the tag using a regular expression.
- Processing Data:
- The script copies all input rows to an output DataFrame (outputDf).
- It iterates through each row of the input DataFrame, extracting the campaign name and existing tag.
- The script generates a new tag using the defined function and prints the campaign name and the new tag.
- Updating Tags:
- The script checks if the newly generated tag is different from the existing tag and updates the region column in the output DataFrame accordingly.
- If the tag is unchanged or empty, it assigns a NaN value to the region column.
- Final Output:
- The script drops every row whose region value is NaN from the output DataFrame, so only campaigns that need a new tag remain.
- It prints the resulting DataFrame if it is not empty; otherwise it prints "Empty outputDf".
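The update and filter steps can also be expressed without an explicit loop. The sketch below is an alternative formulation, not the script's own code: it uses a small invented DataFrame, and the `new_tag` Series is a hypothetical stand-in for the tags returned by get_tag_from_campaign_name.

```python
import numpy as np
import pandas as pd

# Invented sample rows; column names match the script's report columns
df = pd.DataFrame({
    'Account': ['A1', 'A1', 'A2'],
    'Campaign': ['Brand, Southeast', 'Generic, West', 'No tag here'],
    'Region': ['Southeast', 'East', np.nan],
})

# Stand-in for the tags produced by the extraction function
new_tag = pd.Series(['Southeast', 'West', ''], index=df.index)

# Keep a tag only when it is non-empty and differs from the existing value
changed = (new_tag.str.len() > 0) & (new_tag != df['Region'])

out = df.copy()
out['Region'] = new_tag.where(changed)  # unchanged or empty tags become NaN
out = out.dropna(subset=['Region'])     # keep only rows that need an update

print(out)
```

Here `&` is the element-wise operator on boolean Series. Only the second row survives, because its tag is non-empty and differs from the existing region.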
Vitals
- Script ID : 675
- Client ID / Customer ID: 1306913045 / 60268001
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Region
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: dwaidhas@marinsoftware.com (dwaidhas@marinsoftware.com)
- Created by dwaidhas@marinsoftware.com on 2024-02-01 15:42
- Last Updated by dwaidhas@marinsoftware.com on 2024-02-01 18:10
Python Code
##
## Name: Campaign Name - Dimension Auto Tagging: Region
## Description:
##  Parse Campaign Name and add Campaign-level Marin Dimensions Tag for Region
##  Tag appears after the first ',' comma in campaign name.
##
## author: Dana Waidhas
## created: 2024-01-31
##

import re
import numpy as np

########### Configurable Params - START ##########
SEP = ','
TAG_LOCATION = 1  # Comes after the first separator

# Primary data source and columns
# (dataSourceDict and tableize are not defined in this listing)
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_REGION = 'Region'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_REGION = 'Region'


def get_tag_from_campaign_name(campaign_name):
    # Split only at the first comma
    vals = campaign_name.split(SEP, 1)
    tag = vals[1].strip() if len(vals) > 1 else ''

    # Additional logic to extract 'USA - [Local] - Nashville' from the campaign name
    if ' - USA - [Local] - ' in campaign_name:
        # The pattern needs a trailing ' - ' after the city, so guard against no match
        match = re.search(r'- USA - \[Local\] - (.*?) - ', campaign_name)
        if match:
            tag += 'USA - [Local] - ' + match.group(1)

    # Remove '2017' if present in the tag using regular expression
    tag = re.sub(r'\b2017\b', '', tag)
    return tag.strip()


# Copy all input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    existing_tag = row[RPT_COL_REGION]
    campaign_name = row[RPT_COL_CAMPAIGN]
    tag = get_tag_from_campaign_name(campaign_name)
    print("Campaign [%s] => Tag [%s]" % (campaign_name, tag))

    # Only tag if it's non-empty and different than the existing tag
    if len(tag) > 0 and tag != existing_tag:
        outputDf.at[index, BULK_COL_REGION] = tag
    else:
        outputDf.at[index, BULK_COL_REGION] = np.nan

# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_REGION])

if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")
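The listing relies on two names it does not define, dataSourceDict and tableize, which are presumably supplied by the hosting environment. To try the logic outside that environment, one could stub them first. The stand-ins below are assumptions for local testing only, with invented sample data, and are not the platform's actual implementations.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in: the script reads its input from dataSourceDict["1"]
dataSourceDict = {
    "1": pd.DataFrame({
        'Account': ['Acme'],
        'Campaign': ['Brand Search, Southeast 2017'],
        'Region': [np.nan],
    })
}

def tableize(df):
    """Hypothetical stand-in: render a DataFrame as plain text."""
    return df.to_string(index=False)

inputDf = dataSourceDict["1"]
print(tableize(inputDf))
```

With these stubs defined before the script body, the rest of the listing can be run as an ordinary Python file.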
Post generated on 2025-03-11 01:25:51 GMT