Script 1525: Script AutoTag Campaign Strategy

Purpose

The Python script extracts and tags a specific segment from campaign names based on a defined delimiter.

To Elaborate

The script is designed to process a dataset containing campaign information, specifically focusing on extracting a segment of the campaign name that is situated between the first and second occurrence of a specified delimiter, which is a dash (‘-‘). This extracted segment is then tagged as the “Campaign Strategy” for each campaign. The script ensures that only campaigns with at least two dashes in their names are processed, and it updates the dataset with the extracted strategy only if it differs from any existing tag. The final output includes only those campaigns that have a non-empty “Campaign Strategy” tag, ensuring that the data is clean and relevant for further analysis or reporting.

Walking Through the Code

  1. Configurable Parameters:
    • The script begins by defining a configurable parameter PLACEMENT_KEY, which is set to a dash (‘-‘). This key is used to split the campaign names.
  2. Data Preparation:
    • The primary data source is loaded into inputDf, which contains the campaign data.
    • The script prepares an output DataFrame outputDf by copying all rows from inputDf.
  3. Function Definition:
    • A function get_value_between_dashes is defined to extract the segment between the first and second dashes in a campaign name. If the name does not contain at least two dashes, it returns NaN.
  4. Processing Each Campaign:
    • The script iterates over each row in inputDf. For each campaign name, it checks if there are at least two dashes.
    • If so, it uses the get_value_between_dashes function to extract the strategy segment.
    • The extracted value is then compared to the existing tag, and if it is valid (not NaN), it updates the Campaign Strategy column in outputDf.
  5. Finalizing Output:
    • The script removes any rows from outputDf where the Campaign Strategy is empty.
    • It also strips any extra whitespace from the campaign names to ensure data cleanliness.

Vitals

  • Script ID : 1525
  • Client ID / Customer ID: 1306928453 / 60270539
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Campaign Strategy
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
  • Created by Grégory Pantaine on 2024-11-14 17:24
  • Last Updated by Grégory Pantaine on 2024-11-14 17:34
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
##
## name: Script AutoTag - Campaign Strategy
## description: Tags the value after the first - and before the second - in the campaign name,
## ie: Credit Cards 
## 
## author: G Pantaine with help from ChatGPT & M Huang.
## created: 2024-11-14
## 
# Configurable Params - START
PLACEMENT_KEY = '-'

# Primary data source and columns
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN_STRATEGY = 'Campaign Strategy'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_CAMPAIGN_STRATEGY = 'Campaign Strategy'

# Function to extract the value after the first '-' and before the second '-'
def get_value_between_dashes(campaign_name):
    parts = campaign_name.split(PLACEMENT_KEY)
    if len(parts) > 2:
        return parts[1].strip()
    else:
        return np.nan

# Copy all input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]
    
    # Skip processing if campaign name does not contain at least two placement keys
    if campaign_name.count(PLACEMENT_KEY) < 2:
        continue

    value = get_value_between_dashes(campaign_name)

    # Only tag if it's different than the existing tag
    if pd.notna(value):
        outputDf.at[index, BULK_COL_CAMPAIGN_STRATEGY] = value
    else:
        outputDf.at[index, BULK_COL_CAMPAIGN_STRATEGY] = np.nan

# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_CAMPAIGN_STRATEGY])

# Remove extra whitespace from campaign name that breaks Preview
outputDf[RPT_COL_CAMPAIGN] = outputDf[RPT_COL_CAMPAIGN].str.strip()

if not outputDf.empty:
    print("outputDf", outputDf.head().to_string())
else:
    print("Empty outputDf")

Post generated on 2024-11-27 06:58:46 GMT

comments powered by Disqus