Script 1525: Script AutoTag Campaign Strategy
Purpose
The Python script extracts and tags a specific segment from campaign names based on a defined delimiter.
To Elaborate
The script is designed to process a dataset containing campaign information, specifically focusing on extracting a segment of the campaign name that is situated between the first and second occurrence of a specified delimiter, which is a dash (‘-‘). This extracted segment is then tagged as the “Campaign Strategy” for each campaign. The script ensures that only campaigns with at least two dashes in their names are processed, and it updates the dataset with the extracted strategy only if it differs from any existing tag. The final output includes only those campaigns that have a non-empty “Campaign Strategy” tag, ensuring that the data is clean and relevant for further analysis or reporting.
Walking Through the Code
- Configurable Parameters:
- The script begins by defining a configurable parameter
PLACEMENT_KEY
, which is set to a dash (‘-‘). This key is used to split the campaign names.
- The script begins by defining a configurable parameter
- Data Preparation:
- The primary data source is loaded into
inputDf
, which contains the campaign data. - The script prepares an output DataFrame
outputDf
by copying all rows frominputDf
.
- The primary data source is loaded into
- Function Definition:
- A function
get_value_between_dashes
is defined to extract the segment between the first and second dashes in a campaign name. If the name does not contain at least two dashes, it returnsNaN
.
- A function
- Processing Each Campaign:
- The script iterates over each row in
inputDf
. For each campaign name, it checks if there are at least two dashes. - If so, it uses the
get_value_between_dashes
function to extract the strategy segment. - The extracted value is then compared to the existing tag, and if it is valid (not
NaN
), it updates theCampaign Strategy
column inoutputDf
.
- The script iterates over each row in
- Finalizing Output:
- The script removes any rows from
outputDf
where theCampaign Strategy
is empty. - It also strips any extra whitespace from the campaign names to ensure data cleanliness.
- The script removes any rows from
Vitals
- Script ID : 1525
- Client ID / Customer ID: 1306928453 / 60270539
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Campaign Strategy
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-11-14 17:24
- Last Updated by Grégory Pantaine on 2024-11-14 17:34
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
##
## name: Script AutoTag - Campaign Strategy
## description: Tags the value after the first - and before the second - in the campaign name,
## ie: Credit Cards
##
## author: G Pantaine with help from ChatGPT & M Huang.
## created: 2024-11-14
##
# Configurable Params - START
PLACEMENT_KEY = '-'
# Primary data source and columns
inputDf = dataSourceDict["1"]
# Output columns and initial values
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN_STRATEGY = 'Campaign Strategy'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_CAMPAIGN_STRATEGY = 'Campaign Strategy'
# Function to extract the value after the first '-' and before the second '-'
def get_value_between_dashes(campaign_name):
parts = campaign_name.split(PLACEMENT_KEY)
if len(parts) > 2:
return parts[1].strip()
else:
return np.nan
# Copy all input rows to output
outputDf = inputDf.copy()
# Loop through all rows
for index, row in inputDf.iterrows():
campaign_name = row[RPT_COL_CAMPAIGN]
# Skip processing if campaign name does not contain at least two placement keys
if campaign_name.count(PLACEMENT_KEY) < 2:
continue
value = get_value_between_dashes(campaign_name)
# Only tag if it's different than the existing tag
if pd.notna(value):
outputDf.at[index, BULK_COL_CAMPAIGN_STRATEGY] = value
else:
outputDf.at[index, BULK_COL_CAMPAIGN_STRATEGY] = np.nan
# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_CAMPAIGN_STRATEGY])
# Remove extra whitespace from campaign name that breaks Preview
outputDf[RPT_COL_CAMPAIGN] = outputDf[RPT_COL_CAMPAIGN].str.strip()
if not outputDf.empty:
print("outputDf", outputDf.head().to_string())
else:
print("Empty outputDf")
Post generated on 2024-11-27 06:58:46 GMT