Script 675: Campaign Name Dimension Auto Tagging Region
Purpose
The script parses campaign names to automatically tag them with a region dimension based on specific patterns.
To Elaborate
The Python script automates tagging campaigns with a Region dimension. It takes the text after the first comma in the campaign name as the tag and, if the name contains the pattern “ - USA - [Local] - ”, appends a “USA - [Local] - <city>” label built from the text that follows the pattern. It then strips the standalone token “2017” from the tag rather than discarding the tag itself. A campaign is included in the output only when the resulting tag is non-empty and differs from its existing Region value, so campaigns that are already tagged correctly are not uploaded again. This keeps Region tagging consistent and accurate for reporting and analysis.
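The tagging rules described above can be sketched as a small standalone function. This is an illustrative restatement rather than the production code: the name `extract_region_tag` and the sample campaign names are assumptions made up for the example.

```python
import re

SEP = ","  # same separator the script uses


def extract_region_tag(campaign_name: str) -> str:
    """Illustrative restatement of the tagging rules (hypothetical helper)."""
    # Everything after the first separator is the base tag.
    parts = campaign_name.split(SEP, 1)
    tag = parts[1].strip() if len(parts) > 1 else ""

    # Names containing '- USA - [Local] - <city> - ' get a local label appended.
    match = re.search(r"- USA - \[Local\] - (.*?) - ", campaign_name)
    if match:
        tag += "USA - [Local] - " + match.group(1)

    # Drop a standalone '2017' token, then trim surrounding whitespace.
    return re.sub(r"\b2017\b", "", tag).strip()


# Made-up campaign names, for illustration only.
examples = [
    "Brand Search, EMEA",
    "Brand Search, EMEA 2017",
    "Music - USA - [Local] - Nashville - Search",
    "Brand Search",
]
for name in examples:
    print(repr(name), "->", repr(extract_region_tag(name)))
```

A name without a comma and without the local pattern yields an empty tag, which the script treats as "nothing to upload". When a name has both a comma tag and the local pattern, the local label is concatenated directly onto the comma tag with no separator, mirroring the script's behavior.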
Walking Through the Code
- Configurable Parameters:
- The script begins by defining configurable parameters such as the separator (SEP) used to split the campaign name and the location of the tag (TAG_LOCATION).
- It specifies the primary data source and the columns used for input and output.
- Tag Extraction Function:
- A function get_tag_from_campaign_name is defined to extract the tag from the campaign name.
- It splits the campaign name at the first comma and, if the name contains the “ - USA - [Local] - ” pattern, appends a “USA - [Local] - <city>” label to the tag.
- The function also removes any standalone occurrence of the year “2017” from the tag.
- Data Processing:
- The script copies all input rows to an output DataFrame.
- It iterates through each row, extracting the campaign name and existing tag.
- The script generates a new tag using the defined function and compares it with the existing tag.
- If the new tag is different and non-empty, it updates the output DataFrame; otherwise, it sets the region column to NaN.
- Output Filtering:
- The script filters the output DataFrame to include only rows with non-empty tags.
- It prints the final output DataFrame or indicates if it is empty.
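The data processing and output filtering steps above can be sketched on a toy DataFrame. This is a minimal sketch under stated assumptions: `input_df` stands in for the platform-provided data source, its rows are made up, and `simple_tag` is a simplified, hypothetical extractor that only handles the comma rule.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the report data source; values are made up.
input_df = pd.DataFrame({
    "Account": ["A1", "A1", "A2"],
    "Campaign": ["Brand, EMEA", "Generic, APAC", "NoTag"],
    "Region": ["EMEA", "", ""],
})


def simple_tag(campaign_name: str) -> str:
    # Simplified extractor: text after the first comma, trimmed.
    parts = campaign_name.split(",", 1)
    return parts[1].strip() if len(parts) > 1 else ""


# Start from a copy so the input is left untouched.
output_df = input_df.copy()
for index, row in input_df.iterrows():
    tag = simple_tag(row["Campaign"])
    if tag and tag != row["Region"]:
        # New, non-empty tag: stage it for upload.
        output_df.at[index, "Region"] = tag
    else:
        # Unchanged or empty tag: mark the row for removal.
        output_df.at[index, "Region"] = np.nan

# Keep only rows that actually carry a new tag.
output_df = output_df.dropna(subset=["Region"])
print(output_df)
```

Marking unchanged rows with NaN and then dropping them means the upload contains only campaigns whose Region value would actually change.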
Vitals
- Script ID : 675
- Client ID / Customer ID: 1306913045 / 60268001
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Region
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: dwaidhas@marinsoftware.com (dwaidhas@marinsoftware.com)
- Created by dwaidhas@marinsoftware.com on 2024-02-01 15:42
- Last Updated by dwaidhas@marinsoftware.com on 2024-02-01 18:10
Python Code
##
## Name: Campaign Name - Dimension Auto Tagging: Region
## Description:
## Parse Campaign Name and add Campaign-level Marin Dimensions Tag for Region
## Tag appears after the first ',' comma in campaign name.
##
## author: Dana Waidhas
## created: 2024-01-31
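##
# re and numpy are used below; re-importing is harmless if the runtime already provides them
import re
import numpy as np

########### Configurable Params - START ##########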
SEP = ','
TAG_LOCATION = 1 # Comes after the first separator
# Primary data source and columns
inputDf = dataSourceDict["1"]
# Output columns and initial values
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_REGION = 'Region'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_REGION = 'Region'
def get_tag_from_campaign_name(campaign_name):
    # Split only at the first comma
    vals = campaign_name.split(SEP, 1)
    tag = vals[1].strip() if len(vals) > 1 else ''
    # Additional logic to extract 'USA - [Local] - Nashville' from the campaign name
    if ' - USA - [Local] - ' in campaign_name:
        # The regex needs a trailing ' - ' after the city; skip when it is missing
        # instead of failing on a None match
        match = re.search(r'- USA - \[Local\] - (.*?) - ', campaign_name)
        if match:
            tag += 'USA - [Local] - ' + match.group(1)
    # Remove '2017' if present in the tag using regular expression
    tag = re.sub(r'\b2017\b', '', tag)
    return tag.strip()
# Copy all input rows to output
outputDf = inputDf.copy()
# Loop through all rows
for index, row in inputDf.iterrows():
    existing_tag = row[RPT_COL_REGION]
    campaign_name = row[RPT_COL_CAMPAIGN]
    tag = get_tag_from_campaign_name(campaign_name)
    print("Campaign [%s] => Tag [%s]" % (campaign_name, tag))
    # Only tag if it's different than the existing tag
    if len(tag) > 0 and tag != existing_tag:
        outputDf.at[index, BULK_COL_REGION] = tag
    else:
        outputDf.at[index, BULK_COL_REGION] = np.nan
# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_REGION])
if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")
Post generated on 2024-11-27 06:58:46 GMT