Script 1523: Script AutoTag Campaign ID Number

Purpose:

The script extracts and tags the numeric value before the first dash in a campaign name, removing any parentheses.

To Elaborate

The Python script is designed to process a dataset containing campaign information, specifically focusing on extracting a numeric identifier from the campaign name. The script identifies the numeric value enclosed in parentheses before the first dash in the campaign name and assigns it as the “Campaign ID #”. This operation is crucial for organizing and tagging campaigns with a unique identifier, which can be used for further analysis or reporting. The script ensures that only valid numeric identifiers are tagged and removes any unnecessary whitespace from the campaign names to maintain data consistency.

Walking Through the Code

  1. Configurable Parameters:
    • The script begins by defining a configurable parameter PLACEMENT_KEY, which is set to the dash character (‘-‘). This key is used to identify the position in the campaign name where the numeric value should be extracted.
  2. Data Preparation:
    • The primary data source is loaded into inputDf, which contains columns for campaign and account information. The script prepares an output DataFrame outputDf by copying the input data.
  3. Function Definition:
    • A function get_value_before_dash is defined to extract the numeric value from the campaign name. It uses a regular expression to match and extract numbers enclosed in parentheses.
  4. Data Processing:
    • The script iterates over each row in the input DataFrame. For each campaign name, it checks if the PLACEMENT_KEY is present. If so, it calls the get_value_before_dash function to extract the numeric value.
  5. Tagging and Cleaning:
    • If a valid numeric value is found, it is assigned to the “Campaign ID #” column in the output DataFrame. The script also removes any rows where the “Campaign ID #” is not available and strips extra whitespace from the campaign names.
  6. Output:
    • Finally, the script checks if the output DataFrame is empty and prints the first few rows if it contains data, ensuring that the processed information is ready for further use.

Vitals

  • Script ID : 1523
  • Client ID / Customer ID: 1306928453 / 60270539
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Campaign ID #
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
  • Created by Grégory Pantaine on 2024-11-14 17:23
  • Last Updated by emerryfield@marinsoftware.com on 2025-01-21 22:58
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
##
## name: Script AutoTag - Campaign ID Number
## description: Tags the value before the first - in the campaign name as the number without the brackets,
## ie: 5802 
## 
## author: G Pantaine with help from ChatGPT & M Huang.
## created: 2024-11-14
## 

# Configurable Params - START
PLACEMENT_KEY = '-'

# Primary data source and columns
inputDf = dataSourceDict["1"]

# Output columns and initial values
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_NUMERO_PROJET = 'Campaign ID #'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_NUMERO_PROJET = 'Campaign ID #'

# Function to extract the value before the first '-' and remove parentheses
def get_value_before_dash(campaign_name):
    match = re.match(r"\((\d+)\)", campaign_name)
    if match:
        return match.group(1)
    else:
        print("Value not found: " + campaign_name)
        return np.nan

# Copy all input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]
    
    # Skip processing if campaign name does not contain the placement key
    if PLACEMENT_KEY not in campaign_name:
        continue

    value = get_value_before_dash(campaign_name)

    # Only tag if it's different than the existing tag
    if pd.notna(value):
        outputDf.at[index, BULK_COL_NUMERO_PROJET] = value
    else:
        outputDf.at[index, BULK_COL_NUMERO_PROJET] = np.nan

# Only include non-empty tags in bulk
outputDf = outputDf.dropna(subset=[BULK_COL_NUMERO_PROJET])

# Remove extra whitespace from campaign name that breaks Preview
outputDf[RPT_COL_CAMPAIGN] = outputDf[RPT_COL_CAMPAIGN].str.strip()

if not outputDf.empty:
    print("outputDf", outputDf.head().to_string())
else:
    print("Empty outputDf")

Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus